Abstract

We  consider  the  control  of  interacting  subsystems  whose  dynamics  and  constraints 
are  uncoupled,  but  whose  state  vectors  are  coupled  non-separably  in  a  single  centralized 
cost  function  of  a  finite  horizon  optimal  control  problem.  For  a  given  centralized 
cost  structure,  we  generate  distributed  optimal  control  problems  for  each  subsystem 
and  establish  that  the  distributed  receding  horizon  implementation  is  asymptotically 
stabilizing.  The  communication  requirements  between  subsystems  with  coupling  in  the 
cost  function  are  that  each  subsystem  obtain  the  previous  optimal  control  trajectory  of 
those  subsystems  at  each  receding  horizon  update.  The  key  requirements  for  stability 
are  that  each  distributed  optimal  control  not  deviate  too  far  from  the  previous  optimal 
control,  and  that  the  receding  horizon  updates  happen  sufficiently  fast.  The  theory  is 
applied  in  simulation  for  stabilization  of  a  formation  of  vehicles. 
1  Introduction


We  are  interested  in  the  control  of  a  set  of  dynamically  decoupled  subsystems  that  are 
required  to  perform  a  cooperative  task.  An  example  of  such  a  situation  is  a  group  of  vehicles 
cooperatively converging to a desired formation, as explored in Olfati-Saber et al. [23], Dunbar
and  Murray  [10],  Ren  and  Beard  [25],  and  Leonard  and  Fiorelli  [17].  One  control  approach 
that  accommodates  a  general  cooperative  objective  is  receding  horizon  control.  In  receding 
horizon  control,  or  model  predictive  control,  the  current  control  action  is  determined  by 
solving  on-line,  at  each  sampling  instant,  a  finite  horizon  open-loop  optimal  control  problem. 
Each  optimization  yields  an  open-loop  optimal  control  trajectory  and  the  initial  portion  of 
the  trajectory  is  applied  to  the  system  until  the  next  sampling  instant.  A  survey  of  receding 
horizon control is given by Mayne et al. [18]. For the problem of interest here, the cooperation
between  subsystems  can  be  incorporated  in  the  optimal  control  problem  by  including  terms 
in  the  cost  function  that  depend  on  their  respective  state  vectors,  as  is  done  in  [10]  and  [23]. 
It  is  presumed  that  at  least  some  of  the  terms  that  couple  states  of  cooperating  subsystems 
are non-separable, i.e., not additively separable. Otherwise, the subsystems would not be
directly  cooperating,  in  the  sense  that  their  optimal  controls  are  not  directly  influenced  by 
the  state  of  the  other  subsystem.  Henceforth,  we  refer  to  each  subsystem  as  an  agent  and 
any  two  agents  that  are  cooperating  are  referred  to  as  neighbors.  Thus,  neighbors  have  a 
non-separable  term  coupling  their  states  in  the  single,  centralized  cost  function.  Aside  from 
being  able  to  handle  the  cooperative  performance  objective,  the  receding  horizon  control 
approach  is  particularly  useful  when  the  individual  subsystems  are  also  required  to  satisfy 
state  and  control  constraints,  as  is  the  case  in  general  for  vehicles. 

A  drawback  of  the  receding  horizon  control  approach  to  our  problem  is  that  currently  only 
a  centralized  solution  and  implementation  can  guarantee  asymptotic  stability  theoretically. 
However,  a  distributed  solution  to  the  problem  is  desirable,  for  autonomy  of  the  individual 
subsystems  and  for  potential  scalability  and  improved  tractability  of  the  approach.  In  that 
case,  each  agent  would  be  assigned  its  own  optimal  control  problem,  implemented  in  a 
distributed  receding  horizon  fashion. 

Previous work on distributed receding horizon control includes Jia and Krogh [14], Motee
and Sayyar-Rodsari [22] and Acar [1]. In all of these papers, the cost is quadratic and
separable,  while  the  dynamics  are  discrete-time,  linear,  time-invariant  and  coupled.  Further, 
state  and  input  constraints  are  not  included,  aside  from  a  stability  constraint  in  [14]  that 
permits  state  information  exchanged  between  the  agents  to  be  delayed  by  one  update  period. 
In another work, Jia and Krogh [15] solve a min-max problem for each agent, where again
coupling comes in the dynamics and the neighboring agent states are treated as bounded
disturbances. Stability is obtained by contracting each agent's state constraint set at each
sample period, until the objective set is reached. As such, stability does not depend on
information updates with neighboring agents, although such updates may improve
performance. More recently, Keviczky et al. [16] have formulated a distributed model predictive
scheme  where  each  agent  optimizes  locally  for  itself  and  every  neighbor  at  each  update.  By 
this  formulation,  feasibility  becomes  difficult  to  ensure,  and  no  proof  of  stability  is  provided. 
The authors also consider a hierarchical scheme, similar to that in [19], where the scheme
depends on a particular interconnection graph structure (e.g., no cycles are permitted).

In  this  paper,  we  start  with  an  asymptotically  stabilizing  centralized  receding  horizon 
control  law,  based  on  the  problem  formulation  and  results  in  the  dissertation  of  Chen  [5], 
which is summarized in [6]. The performance objective is relevant for a multi-vehicle
formation stabilization problem. The centralized integrated cost is then decomposed to define
distributed integrated costs, and asymptotic stability is proven under stated conditions. Key
requirements for stability are that the receding horizon updates happen sufficiently fast, and
each distributed optimal control trajectory is required to not deviate too far from the
previous optimal trajectory, over the optimized horizon time. We should emphasize that the
multi-vehicle formation stabilization problem is simply a venue. In other problems where
the centralized integrated cost can be decomposed in the same way, namely such that the
summation of the distributed costs recovers the centralized cost, the approach is applicable.
With slight modification to the theory in this paper, the dynamics of the individual
subsystems need not be linear or homogeneous, i.e., all subsystems could have different, nonlinear
dynamics. Such extensions are worked out elsewhere [9].

In our distributed approach, no communication is required between agents while the
distributed optimal control problems are being solved. This is an advantage over parallelization
methods [3], where every distributed optimization must communicate with neighboring
optimizations while iterating. Thus, the approach here would incur less computational and
communication delay effects than an approach using receding horizon control with
parallelization methods. On the other hand, parallelization can guarantee convergence to the
centralized solution, whereas the distributed receding horizon controller here, while
stabilizing, will perform differently in general than the centralized receding horizon controller.

The  organization  of  the  paper  is  as  follows.  Section  2  defines  the  control  objective  and 
cost  function  used  in  the  optimal  control  problem.  The  cost  function  is  relevant  for  multiple 
vehicle  formation  stabilization.  We  note  that  the  stability  results  are  facilitated  by  this  choice 
of  cost,  but  the  results  hold  for  any  cost  with  a  similar  decomposable  structure.  Section  3 
defines  the  optimal  control  problem  and  reviews  requirements  for  asymptotic  stability  of  the 
centralized  receding  horizon  control  law.  Section  4  details  the  distributed  receding  horizon 
implementation  and  a  proof  of  asymptotic  stability.  Simulation  results  of  a  multi-vehicle 
formation  are  then  given  in  Section  5.  Finally,  Section  6  discusses  conclusions  and  extensions. 

2  Formation  Stabilization  Objective 

In  this  section,  we  present  the  system  dynamics  and  constraints  and  define  the  control 
objective.  To  facilitate  the  analysis,  we  consider  only  linear  dynamics  here,  although  the 
nonlinear  case  is  treated  elsewhere  [9]. 

We  wish  to  stabilize  a  group  of  agents  toward  a  common  objective  in  a  cooperative  way 
using  receding  horizon  control.  Each  agent  is  assumed  to  have  dynamics,  described  by  an 
ordinary  differential  equation,  completely  decoupled  from  all  other  agents.  Specifically,  for 
$i = 1, \ldots, N_a$ agents, the state and control of agent $i$ are $z_i(t) = (q_i(t), \dot{q}_i(t)) \in \mathbb{R}^{2n}$ and
$u_i(t) \in \mathbb{R}^n$, respectively, and the dynamics are given by

\[ \dot{z}_i(t) = A_i z_i(t) + B_i u_i(t), \quad t \ge 0, \quad z_i(0) \text{ given}, \]

where

\[ A_i = \begin{bmatrix} 0 & I_{(n)} \\ 0 & 0 \end{bmatrix}, \qquad B_i = \begin{bmatrix} 0 \\ I_{(n)} \end{bmatrix}. \]

The matrix $I_{(n)}$ is the identity matrix of dimension $n$. Each agent $i$ is also subject to the
input and state constraints

\[ u_i(t) \in \mathcal{U}, \quad z_i(t) \in \mathcal{Z}, \quad t \ge 0. \]

An admissible control is any piecewise, right-continuous function $u_i(\cdot) : [0, T] \to \mathcal{U}$, for any
$T \ge 0$, such that given an initial state $z_i(0) \in \mathcal{Z}$, the control generates the state trajectory
$z_i(t; z_i(0)) \in \mathcal{Z}$ for all $t \in [0, T]$.

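The set $\mathcal{Z}^N$ is the $N$-times Cartesian product $\mathcal{Z} \times \cdots \times \mathcal{Z}$. Concatenating the states
and inputs into vectors as $q = (q_1, \ldots, q_{N_a})$, $\dot{q} = (\dot{q}_1, \ldots, \dot{q}_{N_a})$, $z = (z_1, \ldots, z_{N_a}) \in \mathcal{Z}^{N_a}$ and
$u = (u_1, \ldots, u_{N_a}) \in \mathcal{U}^{N_a}$, the dynamics are equivalently

\[ \dot{z}(t) = A z(t) + B u(t), \quad t \ge 0, \quad z(0) \text{ given}, \qquad (1) \]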

where $A = \mathrm{diag}(A_1, \ldots, A_{N_a})$, $B = \mathrm{diag}(B_1, \ldots, B_{N_a})$. Define the invertible map $U : \mathbb{R}^{2nN_a} \to
\mathbb{R}^{2nN_a}$ as

\[ (q, \dot{q}) = U z. \]

Note that $U$ is a unitary matrix, so $U^T U = I$.
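As a minimal numerical sketch of the stacked dynamics (1), the following builds the double-integrator blocks and a reordering matrix. It assumes that $U$ simply permutes the stacked state $z = (q_1, \dot{q}_1, \ldots, q_{N_a}, \dot{q}_{N_a})$ into $(q, \dot{q})$, which is consistent with how $U$ is used later in the section; the function names `agent_matrices`, `block_diag` and `reorder_map` are hypothetical.

```python
import numpy as np

def agent_matrices(n):
    """Double-integrator blocks A_i, B_i for one agent with position dimension n."""
    I = np.eye(n)
    Z = np.zeros((n, n))
    A_i = np.block([[Z, I], [Z, Z]])
    B_i = np.vstack([Z, I])
    return A_i, B_i

def block_diag(blocks):
    """Place the given matrices along the diagonal of a larger zero matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def reorder_map(n, Na):
    """Permutation U with U z = (q, qdot) for z = (q_1, qdot_1, ..., q_Na, qdot_Na)."""
    U = np.zeros((2 * n * Na, 2 * n * Na))
    for i in range(Na):
        for k in range(n):
            U[i * n + k, 2 * n * i + k] = 1.0               # picks component k of q_i
            U[n * Na + i * n + k, 2 * n * i + n + k] = 1.0  # picks component k of qdot_i
    return U

n, Na = 2, 3                      # small illustrative sizes (assumed)
A_i, B_i = agent_matrices(n)
A = block_diag([A_i] * Na)        # A = diag(A_1, ..., A_Na)
B = block_diag([B_i] * Na)        # B = diag(B_1, ..., B_Na)
U = reorder_map(n, Na)

z = np.arange(2 * n * Na, dtype=float)  # arbitrary stacked state
q_qdot = U @ z                          # positions first, then velocities
```

Because `U` is a permutation matrix, `U.T @ U` is the identity, which matches the statement that $U$ is unitary.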

Definition 1. The control objective is to cooperatively asymptotically stabilize all agents
to $z^c = (z_1^c, \ldots, z_{N_a}^c) \in \mathcal{Z}^{N_a}$, an equilibrium point of equation (1), with equilibrium control
equal to zero.

The cooperation is achieved by the minimization of the cost function defined below. The
control objective for each agent $i$ is thus to stabilize to $z_i^c$ while cooperating with neighboring
agents. The position values at $z^c$ are denoted $q^c = (q_1^c, \ldots, q_{N_a}^c)$, and the equilibrium velocity
is clearly zero.

Assumption  1.  The  following  holds: 

(i) $\mathcal{U} \subset \mathbb{R}^n$ is compact, convex and contains the origin in its interior, and $\mathcal{Z} \subset \mathbb{R}^{2n}$ is
convex, connected and contains $z_i^c$ in its interior, for every $i = 1, \ldots, N_a$;

(ii) each agent $i$ can measure the full state $z_i$, there is no uncertainty, and computational
time is negligible compared to the evolution of the closed-loop dynamics.

Remark 1. In the absence of constraints, linear quadratic optimal control could be used to
meet the cooperative control objective. Convexity of $\mathcal{U}$ is related to existence of solutions to
the optimization problem that will be defined; see Section 4.3 of [5] and
references therein for details. Convexity of both $\mathcal{U}$ and $\mathcal{Z}$ is relevant in guaranteeing that
the closed-loop system will have nominal robustness properties [13].
The multiple vehicle formation is here defined by a set of relative vectors that connect
the desired locations of the vehicles. The desired formation can in turn be viewed as a
graph, as in [11, 25]. For example, consider a desired formation of vehicles in Figure 1,
where the position components of each $q_i$ lie in $\mathbb{R}^2$. The left figure shows the
vector structure associated with the formation. The numbers correspond to vehicle identity
and a line segment between two numbers is a two-dimensional relative vector. The dot in
the center of the figure is the center of geometry of vehicles 1, 2 and 3. Given a desired
location for this center of geometry, and the relative vectors between vehicles as shown, this
formation designates a globally unique location for each of the vehicles. To generalize, a
formation of $N_a$ vehicles is uniquely defined given $N_a - 1$ relative vectors, such that each
vehicle is at one end of at least one vector, and one vector (denoted $q_d$) designating a desired
center of geometry location for a subset of the vehicles. The vehicles used to relate to $q_d$ are
called the core vehicles, consistent with the definition in [23]. The figure on the right shows
the associated vehicle formation, where the core vehicles are denoted by white triangles, and
all other vehicles are denoted by black triangles. The tracking objective is being achieved
for some formation path $\mathbb{R} \ni t \mapsto (q_d(t), \dot{q}_d(t)) \in \mathbb{R}^4$.

Figure 1: Seven vehicle formation: vector structure on the left, and resulting formation on
the right.

We now generalize the description of the formation as a graph. The vector formation
graph is defined as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = \{1, 2, \ldots, N_a\}$ is the set of vehicles and $\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$
is the set of relative vectors between vehicles, where an edge in the graph is an ordered
pair $(i, j) \in \mathcal{E}$ for every relative vector between vehicles $i, j \in \mathcal{V}$. All graphs considered
are undirected, so $(i, j) \in \mathcal{E} \Rightarrow (j, i) \in \mathcal{E}$. Two vehicles $i$ and $j$ are called neighbors if
$(i, j) \in \mathcal{E}$. The set of neighbors of agent $i$ is denoted $\mathcal{N}_i \subset \mathcal{V}$. In addition, although any two
core vehicles may not have a relative vector between them in the formation, we consider all
core vehicles to be neighbors of one another, as they are all coupled through the tracking
objective.

Let $\mathcal{E}_o$ denote an orientation of the set $\mathcal{E}$, where $\mathcal{E}_o \subset \mathcal{E}$ contains one and only one of
the two permutations of $(i, j)$, for all $(i, j) \in \mathcal{E}$. Also, without loss of generality, we take the
core vehicles to be 1, 2 and 3, which is the case in Figure 1. As stated, although $(2, 3) \notin \mathcal{E}$,
$3 \in \mathcal{N}_2$ and $2 \in \mathcal{N}_3$ since 2 and 3 are core vehicles and thus coupled through the tracking
objective.

Assumption. The undirected vector formation graph $\mathcal{G}$ is connected.

If the formation graph is not connected, there exists a vehicle whose desired location is not
uniquely specified by the graph. In addition, the cost function that we define would additively
separate into more than one coupled cost function. For a connected graph and resulting cost
function  defined  below,  the  centralized  receding  horizon  control  law  involves  a  single  coupled 
optimal  control  problem  that  will  be  given  in  the  next  section.  The  connectivity  assumption 
clearly  holds  for  the  example  in  Figure  1. 

Remark 2. When the minimal number of relative vectors is used to define the formation,
$|\mathcal{E}_o| = N_a - 1$. It may be that more vectors are added to the formation description, provided
they are consistent with the existing vectors, as described below. Generally, we shall denote
$|\mathcal{E}_o| = M$, $M \ge N_a - 1$. In graph theory, assuming the graph is connected implies $M \ge N_a - 1$
[4].


Let $e_1, \ldots, e_M$ denote an ordering of the elements of $\mathcal{E}_o$. Also, the tail of the edge $e_i$,
denoted $t(e_i)$, is the first element in the corresponding ordered pair and the head of the
vector $h(e_i)$ is the second element. In the case of Figure 1, let

\[ \mathcal{E}_o = \{e_1, e_2, e_3, e_4, e_5, e_6\} = \{(1, 2), (1, 3), (1, 6), (1, 7), (3, 4), (5, 6)\}. \]

For example, we have $t(e_3) = 1$ and $h(e_3) = 6$.

Definition 2. The desired relative vector between any two neighbors $i$ and $j$ is denoted
$d_{ij} \in \mathbb{R}^n$, where it is understood that $q_i^c + d_{ij} = q_j^c$. All desired relative vectors are constant
vectors, in length and orientation, and satisfy the following consistency conditions:

•  For all $(i, j) \in \mathcal{E}$, $d_{ij} = -d_{ji}$.

•  When $M > N_a - 1$, if $(i, j)$, $(j, l)$ and $(i, l)$ are in $\mathcal{E}$, then $d_{ij} + d_{jl} = d_{il}$.

Definition 3. Given an admissible oriented formation graph $\mathcal{G} = (\mathcal{V}, \mathcal{E}_o)$ and a formation
path, the formation vector $F = (f_1, \ldots, f_{M+1}) \in \mathbb{R}^{n(M+1)}$ has components $f_l \in \mathbb{R}^n$ defined as

\[ f_l = q_i - q_j + d_{ij}, \quad \text{where } i = t(e_l),\ j = h(e_l), \quad \forall\, l = 1, \ldots, M, \]

\[ f_{M+1} = q_\Sigma - q_d, \qquad q_\Sigma = \tfrac{1}{3}(q_1 + q_2 + q_3). \]

For a stabilization objective, $\dot{q}_d(t) = 0$, $\forall t \in \mathbb{R}$, and to be compatible with the control
objective, $q_d = (q_1^c + q_2^c + q_3^c)/3$. Clearly, the vehicles are in formation when $F = 0$. Write
the linear mapping from $q$ to $F$ as:

\[ F = Gq + d, \qquad G^T = [\, C_{(n)} \;\; V \,]. \qquad (2) \]


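The vector $d = (\ldots, d_{ij}, \ldots, -q_d)$ has the ordering of the vectors consistent with the
definition of $F$. The matrix $V^T = [V_1 \cdots V_{N_a}] \in \mathbb{R}^{n \times nN_a}$ has elements $V_i \in \mathbb{R}^{n \times n}$ defined
as

\[ V_i = \begin{cases} \tfrac{1}{3} I_{(n)}, & \text{if } i = 1, 2, 3, \\ 0, & \text{otherwise.} \end{cases} \]

The matrix $C_{(n)} \in \mathbb{R}^{nN_a \times nM}$ is related to the incidence matrix $C \in \mathbb{R}^{N_a \times M}$, where the
elements of $C = [c_{ij}]$ are defined in terms of the elements of the oriented edge set $\mathcal{E}_o$ as

\[ c_{ij} = \begin{cases} +1, & \text{vertex } i = t(e_j), \\ -1, & \text{vertex } i = h(e_j), \\ 0, & \text{otherwise.} \end{cases} \]

The matrix $C_{(n)}$ is defined by replacing each element of $C$ with that element multiplied by
$I_{(n)}$. The incidence matrix for the example in Figure 1 is

\[ C = \begin{bmatrix}
 1 &  1 &  1 &  1 &  0 &  0 \\
-1 &  0 &  0 &  0 &  0 &  0 \\
 0 & -1 &  0 &  0 &  1 &  0 \\
 0 &  0 &  0 &  0 & -1 &  0 \\
 0 &  0 &  0 &  0 &  0 &  1 \\
 0 &  0 & -1 &  0 &  0 & -1 \\
 0 &  0 &  0 & -1 &  0 &  0
\end{bmatrix}. \]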
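The construction of the incidence matrix from the oriented edge set can be sketched in a few lines. This is an illustrative sketch only: the function name `incidence_matrix` is hypothetical, vertices are numbered from 1 as in Figure 1, and the Kronecker product is used here as one way to form $C_{(n)}$ by multiplying each entry of $C$ by the identity of dimension $n$.

```python
import numpy as np

def incidence_matrix(num_vertices, oriented_edges):
    """Entry (i, j) is +1 if vertex i is the tail of edge j, -1 if it is the head."""
    C = np.zeros((num_vertices, len(oriented_edges)), dtype=int)
    for j, (tail, head) in enumerate(oriented_edges):
        C[tail - 1, j] = 1    # vertices are 1-based in the text
        C[head - 1, j] = -1
    return C

# Oriented edge set of the seven-vehicle example
edges = [(1, 2), (1, 3), (1, 6), (1, 7), (3, 4), (5, 6)]
C = incidence_matrix(7, edges)

# C_(n): every entry of C scaled by the n x n identity (here n = 2)
n = 2
C_n = np.kron(C, np.eye(n, dtype=int))
```

With these inputs, `C` reproduces the seven-by-six matrix given for the Figure 1 example, and `C_n` has dimensions $nN_a \times nM$.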


In  defining  the  cost  function  for  the  optimal  control  problem,  the  following  proposition 
is  useful. 


Proposition 1. The matrix $G$ in equation (2) has full column rank, equal to $\dim(q) = nN_a$.

Proof. Since the vector formation graph is connected, the incidence matrix $C$ has rank
$N_a - 1$ [4]. Scaling the entries of $C$ with the identity matrix $I_{(n)}$ implies the rank of $C_{(n)}$
is equal to $n(N_a - 1)$. The matrix $V$ in equation (2) has full column rank equal to $n$, and
each column vector is linearly independent from the column vectors in $C_{(n)}$. Thus, $G$ has
rank $nN_a$, which is the column dimension of the matrix.  ■
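The rank argument can be checked numerically for the Figure 1 example. The sketch below is not part of the proof; it assumes $n = 2$, takes the incidence matrix of the example as given, and forms $V$ with blocks $\tfrac{1}{3} I_{(n)}$ for the core vehicles 1, 2 and 3, as implied by the definition of $q_\Sigma$.

```python
import numpy as np

n, Na = 2, 7

# Incidence matrix of the seven-vehicle example (rows: vehicles, columns: edges)
C = np.array([
    [ 1,  1,  1,  1,  0,  0],
    [-1,  0,  0,  0,  0,  0],
    [ 0, -1,  0,  0,  1,  0],
    [ 0,  0,  0,  0, -1,  0],
    [ 0,  0,  0,  0,  0,  1],
    [ 0,  0, -1,  0,  0, -1],
    [ 0,  0,  0, -1,  0,  0],
], dtype=float)

C_n = np.kron(C, np.eye(n))              # each entry of C times I_(n)

V = np.zeros((n * Na, n))
for i in (1, 2, 3):                      # core vehicles
    V[(i - 1) * n:i * n, :] = np.eye(n) / 3.0

G = np.hstack([C_n, V]).T                # G^T = [C_(n)  V]

rank_C = np.linalg.matrix_rank(C)        # N_a - 1 for a connected graph
rank_Cn = np.linalg.matrix_rank(C_n)     # n (N_a - 1)
rank_G = np.linalg.matrix_rank(G)        # n N_a, i.e. full column rank
```

For this example $M = N_a - 1$, so $G$ is square; with additional consistent relative vectors it would have more rows than columns while keeping the same column rank.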


Remark 3. For the rest of the paper, we shall use the norm $\|z\|$ to denote the Euclidean
norm of any vector $z \in \mathbb{R}^m$. In cases where $z$ is a curve, we abuse the notation $\|z\|$ to mean
$\|z(t)\|$ at some instant of time $t$.
From the definition of the formation vector, we know that $Gq^c = -d$ and so

\[ \|F\|^2 = (G(q - q^c))^T G(q - q^c) = \|q - q^c\|^2_{G^T G}. \]

In the following we penalize $\|F\|^2$ and $\|\dot{q}\|^2$ in the centralized cost function. To define the
objective in terms of the state $z$, we have that

\[ \begin{bmatrix} G(q - q^c) \\ \dot{q} \end{bmatrix} = \hat{G} U (z - z^c), \quad \text{where } \hat{G} = \begin{bmatrix} G & 0 \\ 0 & I_{(nN_a)} \end{bmatrix}. \]

As a result, we have

\[ \|F\|^2 + \|\dot{q}\|^2 = \|U(z - z^c)\|^2_{\hat{G}^T \hat{G}}. \]
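For readers who want the intermediate steps, the chain of identities can be written out as follows. The only facts used are $F = Gq + d$, $Gq^c = -d$, zero equilibrium velocity, and the weighted-norm convention $\|x\|_M^2 = x^T M x$, which is assumed here from the way the norm is used in the text.

```latex
\begin{aligned}
F &= Gq + d = Gq - Gq^c = G(q - q^c), \\
\|F\|^2 &= (q - q^c)^T G^T G\,(q - q^c) = \|q - q^c\|^2_{G^T G}, \\
U(z - z^c) &= \begin{bmatrix} q - q^c \\ \dot{q} \end{bmatrix}
  \quad \text{(the equilibrium velocity is zero)}, \\
\|F\|^2 + \|\dot{q}\|^2
  &= \Big\| \begin{bmatrix} G & 0 \\ 0 & I \end{bmatrix} U (z - z^c) \Big\|^2
   = \|U(z - z^c)\|^2_{\hat{G}^T \hat{G}}.
\end{aligned}
```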


In the next section, the optimal control problem associated with the multiple vehicle
formation stabilization objective is defined for receding horizon control.


3  Receding  Horizon  Control 

In  this  section,  we  give  the  receding  horizon  control  law  that  achieves  the  cooperative  control 
objective,  implemented  in  a  centralized  fashion.  The  centralized  integrated  cost  function  of 
interest  is 

\[ L(z, u) = \sum_{(i,j) \in \mathcal{E}_o} \omega \|q_i - q_j + d_{ij}\|^2 + \omega \|q_\Sigma - q_d\|^2 + \nu \|\dot{q}\|^2 + \mu \|u\|^2, \]

with positive weighting constants $\omega$, $\nu$ and $\mu$. We refer to the term $\omega \|q_\Sigma - q_d\|^2$ as the tracking
cost, although we are concerned with stabilization.

Remark  4.  For  collision  avoidance,  an  appropriate  cost  function  between  any  two  agents 
is  defined  in  [23].  Alternatively,  to  guarantee  avoidance,  collision  avoidance  can  be  cast 
as  a  constraint,  as  in  [26].  We  do  not  incorporate  any  type  of  collision  avoidance  in  this 
paper,  although  coupling  constraints  between  neighboring  agents  will  be  discussed  in  the 
conclusions. 

From  the  previous  section,  we  also  have  that 

\[ L(z, u) = \|z - z^c\|_Q^2 + \mu \|u\|^2, \qquad Q = U^T \begin{bmatrix} \omega G^T G & 0 \\ 0 & \nu I_{(nN_a)} \end{bmatrix} U. \qquad (3) \]

From Proposition 1, $Q$ is positive definite and clearly symmetric. At any time $t$, given $z(t)$
and fixed horizon time $T$, the centralized open-loop optimal control problem is

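Problem 1. Find

\[ J^*(z(t), T) = \min_{u(\cdot)} J(z(t), u(\cdot), T), \]

with

\[ J(z(t), u(\cdot), T) = \int_t^{t+T} \|z(\tau; z(t)) - z^c\|_Q^2 + \mu \|u(\tau)\|^2 \, d\tau + \|z(t + T; z(t)) - z^c\|_P^2, \]

subject to

\[ \begin{aligned}
\dot{z}(s) &= A z(s) + B u(s), \\
u(s) &\in \mathcal{U}^{N_a}, \\
z(s; z(t)) &\in \mathcal{Z}^{N_a},
\end{aligned} \qquad s \in [t, t + T], \]

\[ z(t + T; z(t)) \in \Omega(\alpha), \qquad (4) \]

where $P = P^T > 0$ and

\[ \Omega(\alpha) := \{ z \in \mathbb{R}^{2nN_a} : \|z - z^c\|_P^2 \le \alpha, \ \alpha \ge 0 \}. \]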


Equation (4) is called the terminal constraint, as it is a constraint enforced only at
the terminal or end time. Let the first optimal control problem be initialized at some time
$t_0 \in \mathbb{R}$ and let $\delta$ denote the receding horizon update period. The closed-loop system, for
which stability is to be guaranteed, is

\[ \dot{z}(\tau) = A z(\tau) + B u^*_{\mathrm{cent}}(\tau), \quad \tau \ge t_0, \qquad (5) \]

where the centralized receding horizon control law is

\[ u^*_{\mathrm{cent}}(\tau) = u^*_{\mathrm{cent}}(\tau; z(t)), \quad \tau \in [t, t + \delta], \quad 0 < \delta \le T, \]

and $u^*_{\mathrm{cent}}(s; z(t))$, $s \in [t, t + T]$, is the optimal open-loop solution (assumed to exist) to
Problem 1 with initial state $z(t)$. The receding horizon control law is defined for all $t \ge t_0$ by
applying the open-loop optimal solution until each new initial state update $z(t) \leftarrow z(t + \delta)$
is available. This is what we mean when we say a controller is implemented in a “receding
horizon fashion”, since the optimization horizon is always $T$ seconds ahead of each new
update time. The reason to use a sampling period shorter than the open-loop horizon time
is that, practically, there is uncertainty and applying only a fraction of the open-loop control
before re-sampling and recomputing mitigates the effects of uncertainty. The notation above
shows the implicit dependence of the optimal open-loop control $u^*_{\mathrm{cent}}(\cdot)$ on the initial state
$z(t)$ through the optimal control problem. The optimal open-loop state trajectory is denoted
$z^*_{\mathrm{cent}}(\tau; z(t))$. Since Problem 1 is time-invariant, we can set $t = 0$ and solve the optimal control
problem at each initial state update over the time interval $[0, T]$.
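The receding horizon implementation described above can be illustrated with a deliberately simplified stand-in. The sketch below is not Problem 1: it assumes a single double integrator with $n = 1$, an exact zero-order-hold discretization, no input, state or terminal constraints, and a finite-horizon linear quadratic problem solved by a backward Riccati recursion. The function names and all weights are hypothetical choices for illustration.

```python
import numpy as np

def open_loop_lq(Ad, Bd, Q, R, P_T, z0, N):
    """Solve an N-step unconstrained LQ problem from z0; return the open-loop controls."""
    P = P_T
    gains = []
    for _ in range(N):                       # backward Riccati recursion
        K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        P = Q + Ad.T @ P @ (Ad - Bd @ K)
        gains.append(K)
    gains.reverse()                          # gains[k] now belongs to step k
    u_seq, z = [], z0.copy()
    for K in gains:                          # roll the optimal policy forward
        u = -K @ z
        u_seq.append(u)
        z = Ad @ z + Bd @ u
    return u_seq

def receding_horizon(z0, n_updates, N=20, m=5, dt=0.1):
    """Apply the first m of N open-loop controls, then re-solve from the new state."""
    Ad = np.array([[1.0, dt], [0.0, 1.0]])   # exact discretization of a double integrator
    Bd = np.array([[0.5 * dt**2], [dt]])
    Q, R, P_T = np.eye(2), np.eye(1), 10.0 * np.eye(2)
    z, history = z0.copy(), [z0.copy()]
    for _ in range(n_updates):
        u_seq = open_loop_lq(Ad, Bd, Q, R, P_T, z, N)   # horizon T = N * dt
        for u in u_seq[:m]:                              # update period delta = m * dt
            z = Ad @ z + Bd @ u
            history.append(z.copy())
    return np.array(history)

traj = receding_horizon(np.array([1.0, 0.0]), n_updates=40)
```

Here the horizon corresponds to `N * dt` and the update period to `m * dt`, so only the initial portion of each open-loop solution is ever applied, mirroring the role of $\delta < T$ in the text.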

Asymptotic stability of the closed-loop dynamics under the receding horizon
control implementation can be established by taking the optimal cost function $J^*(\cdot)$ as a
Lyapunov function.

Definition  4.  A  feasible  control  is  any  admissible  control  such  that  all  state  constraints  in 
Problem  1  are  satisfied  and  the  optimal  cost  function  is  bounded.  Let  Z  denote  the  set  of 
states  for  which  there  exists  a  feasible  control. 
Assumption 2. The following conditions are satisfied:

(i) the largest constant $\alpha > 0$ in the terminal constraint (4) is chosen such that $\Omega(\alpha) \subseteq
\mathcal{Z}^{N_a}$ and the linear state feedback $u = K(z - z^c)$ and the positive-definite, symmetric
terminal cost $P$ satisfy

\[ (A + BK)^T P + P(A + BK) = -(Q + \mu K^T K), \]

\[ K(z - z^c) \in \mathcal{U}^{N_a}, \quad \forall z \in \Omega(\alpha); \]

(ii) the optimal solution to Problem 1 exists and is numerically obtainable for all $z \in Z$.

Theorem 1. [6, Theorem 1] Under Assumptions 1 and 2, for any $\delta \in (0, T]$, $z^c$ is an
asymptotically stable equilibrium point of the closed-loop system (5) with region of attraction
$Z$.


The  stability  result  in  [6]  only  requires  that  Problem  1  be  feasible  at  initialization,  rather 
than requiring the optimal solution at each update. Also, $\delta$ is required to be sufficiently
small  since  the  authors  consider  quantization  errors  in  the  numerical  implementation  of  the 
receding  horizon  control  law. 
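To make Assumption 2(i) concrete, the Lyapunov equation for the terminal cost can be solved numerically once a stabilizing gain is chosen. The sketch below is illustrative only: it assumes a single double integrator with $n = 1$, a hand-picked gain $K$, $Q = I$ and $\mu = 1$, a scalar input bound $|u| \le u_{\max}$, and no active state constraints. The function name `terminal_cost` and all numerical values are hypothetical.

```python
import numpy as np

def terminal_cost(A, B, K, Q, mu):
    """Solve (A + BK)^T P + P (A + BK) = -(Q + mu K^T K) for symmetric P."""
    Acl = A + B @ K
    rhs = -(Q + mu * K.T @ K)
    m = A.shape[0]
    I = np.eye(m)
    # Row-major vectorization: vec(Acl^T P) = (Acl^T kron I) vec(P),
    #                          vec(P Acl)   = (I kron Acl^T) vec(P)
    lhs = np.kron(Acl.T, I) + np.kron(I, Acl.T)
    P = np.linalg.solve(lhs, rhs.reshape(-1)).reshape(m, m)
    return 0.5 * (P + P.T)                   # remove round-off asymmetry

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[-1.0, -2.0]])                 # assumed stabilizing gain (poles at -1, -1)
Q, mu = np.eye(2), 1.0

P = terminal_cost(A, B, K, Q, mu)
Acl = A + B @ K
residual = Acl.T @ P + P @ Acl + (Q + mu * K.T @ K)

# Largest alpha with |K z| <= u_max on {z : z^T P z <= alpha} (scalar input only):
# the maximum of |K z| over that set is sqrt(alpha * K P^{-1} K^T).
u_max = 1.0
alpha = u_max**2 / (K @ np.linalg.solve(P, K.T)).item()
```

When state constraints are present, $\alpha$ would additionally have to be reduced until $\Omega(\alpha)$ lies inside the admissible state set, which this sketch does not attempt.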

So far, we have detailed the conditions required for asymptotic stability of the centralized
receding horizon control law. In the next section, $N_a$ optimal control problems are defined
for a distributed receding horizon implementation. It will be proven that, for sufficiently
fast updates (small $\delta$), the distributed receding horizon control laws are asymptotically
stabilizing.

4  Distributed  Receding  Horizon  Control 

In  this  section,  a  distributed  receding  horizon  control  law  is  defined.  We  first  introduce 
some useful notation and define $N_a$ separate optimal control problems that are solved and
implemented  in  a  distributed  receding  horizon  fashion.  Next,  we  analyze  the  stability  of  the 
closed-loop  system.  Finally,  we  comment  on  alternative  formulations. 

4.1  Distributed  Optimal  Control  Problems 

In the centralized integrated cost, the non-separable terms $\|q_i - q_j + d_{ij}\|^2$, for all $(i, j) \in \mathcal{E}_o$,
as well as the tracking term $\|q_\Sigma - q_d\|^2$, couple the states of neighboring agents. Recall that
the set of neighbors of each agent $i$ is denoted $\mathcal{N}_i$. In some cases, it will be easier to denote
the set of neighbors $\mathcal{N}_i$ as $-i$ and both shall be used interchangeably.

Let $z_{-i} = (z_{j_1}, \ldots, z_{j_{|\mathcal{N}_i|}})$ denote the vector of states of the neighbors of $i$, i.e., $j_k \in \mathcal{N}_i$, $k =
1, \ldots, |\mathcal{N}_i|$, where the ordering of the states is arbitrary but fixed. Also, let $q_{-i} = (q_{j_1}, \ldots, q_{j_{|\mathcal{N}_i|}})$
and $u_{-i} = (u_{j_1}, \ldots, u_{j_{|\mathcal{N}_i|}})$, where the ordering is consistent with $z_{-i}$.
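Definition 5. The distributed integrated cost in the optimal control problem for any agent
$i = 1, \ldots, N_a$ is defined as

\[ L_i(z_i, z_{-i}, u_i) = L_i^z(z_i, z_{-i}) + \gamma \mu \|u_i\|^2, \]

where

\[ L_i^z(z_i, z_{-i}) = \gamma \Big[ \sum_{j \in \mathcal{N}_i} \frac{\omega}{2} \|q_i - q_j + d_{ij}\|^2 + \nu \|\dot{q}_i\|^2 + L_d(i) \Big], \quad \gamma > 1, \]

\[ L_d(i) = \begin{cases} \dfrac{\omega}{3} \|q_\Sigma - q_d\|^2, & i = 1, 2, 3, \\ 0, & \text{otherwise.} \end{cases} \]

Thus, $\sum_{i=1}^{N_a} L_i(z_i, z_{-i}, u_i) = \gamma L(z, u) = \gamma \left[ \|z - z^c\|_Q^2 + \mu \|u\|^2 \right]$.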

From the definition, the distributed integrated cost for any agent $i$ includes: one-half of
each relative vector penalty coupling $i$ with each neighbor $j \in \mathcal{N}_i$, the velocity and control
penalties for $i$, and, if $i$ is one of the three core vehicles (1, 2 or 3), one-third of the tracking
penalty. In addition, all terms are multiplied by a common factor $\gamma$, a constant greater than
one. In the proof of stability, the key structure is that the sum of the distributed integrated
costs equals the centralized cost multiplied by $\gamma$. For any problem where the centralized cost
can be decomposed in the same way, and the other stated assumptions hold, the stability
results that follow are applicable.
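The decomposition property can be checked numerically on the Figure 1 formation. The sketch below uses randomly drawn positions, velocities and controls and hypothetical weights; it assumes that the relative-vector penalties of agent $i$ run over the neighbors connected to $i$ by a relative vector, so that core vehicles without a relative vector between them (2 and 3) are coupled only through the tracking term.

```python
import numpy as np

rng = np.random.default_rng(0)
n, Na = 2, 7
edges = [(1, 2), (1, 3), (1, 6), (1, 7), (3, 4), (5, 6)]   # oriented edge set
core = (1, 2, 3)
omega, nu, mu, gamma = 2.0, 1.0, 0.5, 1.5                  # hypothetical weights, gamma > 1

agents = range(1, Na + 1)
q_c = {i: rng.normal(size=n) for i in agents}    # desired positions
q = {i: rng.normal(size=n) for i in agents}      # current positions
qdot = {i: rng.normal(size=n) for i in agents}   # current velocities
u = {i: rng.normal(size=n) for i in agents}      # current controls

def d(i, j):
    """Desired relative vector, chosen so that q_i^c + d_ij = q_j^c."""
    return q_c[j] - q_c[i]

def sq(x):
    """Squared Euclidean norm."""
    return float(x @ x)

q_d = sum(q_c[i] for i in core) / 3.0            # desired center of geometry
q_sigma = sum(q[i] for i in core) / 3.0          # current center of geometry

def centralized_cost():
    rel = sum(omega * sq(q[i] - q[j] + d(i, j)) for i, j in edges)
    track = omega * sq(q_sigma - q_d)
    vel = nu * sum(sq(qdot[i]) for i in agents)
    ctrl = mu * sum(sq(u[i]) for i in agents)
    return rel + track + vel + ctrl

def distributed_cost(i):
    nbrs = [b if a == i else a for a, b in edges if i in (a, b)]
    rel = sum(0.5 * omega * sq(q[i] - q[j] + d(i, j)) for j in nbrs)
    track = omega / 3.0 * sq(q_sigma - q_d) if i in core else 0.0
    return gamma * (rel + nu * sq(qdot[i]) + track) + gamma * mu * sq(u[i])

total = sum(distributed_cost(i) for i in agents)
central = centralized_cost()
```

Each relative-vector penalty appears once in the centralized cost and is split in half between its two endpoints, so `total` agrees with `gamma * central` up to round-off.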

Remark 5. The stability results that follow do not depend on equal weighting of terms
between neighboring agents. What is required is that the distributed integrated costs sum
up to be the centralized cost, multiplied by a factor ($\gamma$) greater than one. The weighting will
of course affect the performance of the closed-loop system, so making the weights lop-sided
would result in one agent reacting more to the term than the corresponding neighbor. Note
that in the limit that one agent takes the entire term, while the other ignores the term, we
have a leader-follower effect.


At  each  update  of  the  distributed  receding  horizon  control  laws,  every  agent 

•  senses  its  own  current  state  and  senses  or  receives  the  current  state  of  its  neighbors, 
and 

•  computes  the  optimal  control  trajectory,  comparing  it  to  an  assumed  control  trajectory 
and  based  on  some  assumed  control  trajectories  for  its  neighbors. 

Prior  to  the  next  receding  horizon  update,  every  agent 

•  implements  the  current  optimal  control  trajectory, 

•  computes  the  next  assumed  control  trajectory,  to  be  used  at  the  next  update, 

•  transmits  the  assumed  trajectory  to  all  of  its  neighbors  and  receives  the  assumed  control 
trajectories  from  each  neighbor. 
Implicit in the procedure above is that the assumed control for each agent $i$ is consistent in
every optimization problem in which it occurs, i.e., in the optimal control problem for agent $i$
and for each neighbor $j \in \mathcal{N}_i$. Before defining the computation for the optimal and assumed
control trajectories, we introduce some notation.

Definition 6. Consider a common interval of time $[t, t + T]$ in the optimal control problem
for every agent $i = 1, \ldots, N_a$. Associated with the initial state $z_i(t)$, we denote:

applied control $u_i(\cdot; z_i(t))$: the control being optimized in the problem and applied to
the system over the subinterval $[t, t + \delta]$;

assumed control $\hat{u}_i(\cdot; z_i(t))$: the control to which the optimized control is compared and
which all neighbors assume $i$ is employing over the interval.

The state trajectories corresponding to the applied and assumed controls are denoted
$z_i(\cdot; z_i(t))$ and $\hat{z}_i(\cdot; z_i(t))$, respectively. For each agent $i$, given the current state $z_j(t)$ and
assumed control $\hat{u}_j(s; z_j(t))$, $s \in [t, t + T]$, of every neighbor $j \in \mathcal{N}_i$, the assumed state
trajectory $\hat{z}_j(s; z_j(t))$, $s \in [t, t + T]$, is computed using the dynamic model for that agent.
An important point is that the initial condition of every assumed state trajectory is equal
to the actual state value of the corresponding agent at that time, that is

\[ \hat{z}_i(t; z_i(t)) = z_i(t) \]

for every $i = 1, \ldots, N_a$. To be consistent with the notation $z_{-i}$, let $\hat{z}_{-i}(\cdot; z_{-i}(t))$ and $\hat{u}_{-i}(\cdot; z_{-i}(t))$
be the vector of assumed neighbor states and controls, respectively, of agent $i$. With
consistent initial conditions then we also have that $\hat{z}_{-i}(t; z_{-i}(t)) = z_{-i}(t)$.
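Computing an assumed state trajectory from a neighbor's current state and assumed control amounts to integrating that neighbor's dynamic model. As a minimal sketch, assuming double-integrator dynamics and an assumed control that is piecewise constant on a uniform time grid (a discretization chosen here purely for illustration), the propagation is exact:

```python
import numpy as np

def assumed_state_trajectory(z0, u_hat, dt):
    """Propagate z = (q, qdot) of one double-integrator agent under an assumed
    control that holds the value u_hat[k] on [k*dt, (k+1)*dt)."""
    n = z0.size // 2
    q, v = z0[:n].copy(), z0[n:].copy()
    traj = [z0.copy()]                       # starts at the actual current state
    for u in u_hat:
        q = q + dt * v + 0.5 * dt**2 * u     # exact position update for constant u
        v = v + dt * u                       # exact velocity update for constant u
        traj.append(np.concatenate([q, v]))
    return np.array(traj)

# Hypothetical neighbor j with n = 2: state (q_j, qdot_j) and ten assumed control samples
z_j = np.array([1.0, -2.0, 0.0, 0.5])
u_hat_j = np.tile([0.2, -0.1], (10, 1))
z_hat_j = assumed_state_trajectory(z_j, u_hat_j, dt=0.1)
```

The first row of `z_hat_j` equals the neighbor's actual current state, which is the consistency condition on initial conditions stated above.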

The distributed optimal control problems are now defined. Denote the receding horizon update times as $t_k = t_0 + \delta k$, where $k \in \mathbb{N} = \{0, 1, 2, \dots\}$. Common to each problem, we are given the constant $\gamma \in (1, \infty)$ from Definition 5, update period $\delta \in (0, T)$ and fixed horizon time $T$. Conditions will be placed on the update period $\delta$ in the next section to guarantee stability of the closed-loop system. The collection of distributed open-loop optimal control problems is

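Problem 2. For every agent $i = 1, \dots, N_a$ and at any update time $t_k$, given $z_i(t_k)$, $\hat{z}_{-i}(t_k; z_{-i}(t_k)) = z_{-i}(t_k)$, and $\hat{u}_i(s; z_i(t_k))$ and $\hat{u}_{-i}(s; z_{-i}(t_k))$ for all $s \in [t_k, t_k+T]$, find

$$
J_i^*(z_i(t_k), z_{-i}(t_k), T) = \min_{u_i(\cdot)} J_i(z_i(t_k), z_{-i}(t_k), u_i(\cdot; z_i(t_k)), T),
$$

where $J_i(z_i(t_k), z_{-i}(t_k), u_i(\cdot; z_i(t_k)), T)$ is equal to

$$
\int_{t_k}^{t_k+T} L_i\big(z_i(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u_i(\tau; z_i(t_k))\big)\, d\tau + \gamma \|z_i(t_k+T; z_i(t_k)) - z_i^c\|^2_{P_i},
$$

subject to

$$
\begin{aligned}
\dot{z}_i(s; z_i(t_k)) &= A_i z_i(s; z_i(t_k)) + B_i u_i(s; z_i(t_k)), \\
\dot{\hat{z}}_j(s; z_j(t_k)) &= A_j \hat{z}_j(s; z_j(t_k)) + B_j \hat{u}_j(s; z_j(t_k)), \quad \forall j \in \mathcal{N}_i, \\
u_i(s; z_i(t_k)) &\in \mathcal{U}, \\
z_i(s; z_i(t_k)) &\in \mathcal{Z}, \\
\|u_i(s; z_i(t_k)) - \hat{u}_i(s; z_i(t_k))\| &\le \delta^2 \kappa, \qquad s \in [t_k, t_k+T], \\
z_i(t_k+T; z_i(t_k)) &\in \Omega_i(\varepsilon_i), \qquad (7)
\end{aligned}
$$

given positive constant $\kappa \in (0, \infty)$, $P_i = P_i^T > 0$ and where

$$
\Omega_i(\varepsilon_i) := \{ z \in \mathbb{R}^{2n} : \|z - z_i^c\|_{P_i} \le \varepsilon_i,\ \varepsilon_i > 0 \}.
$$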

As part of the optimal control problem, the applied control for $i$ is constrained to be at most a distance of $\delta^2 \kappa$ from the assumed control. We refer to this constraint as the control comparison constraint. Naturally, the constraint is a means of enforcing a degree of consistency between what an agent is actually doing and what neighbors believe that agent is doing. The assumed control for each agent, as well as each terminal cost weighting $P_i$, will be defined below.
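To make the two agent-specific constraints of Problem 2 concrete, the sketch below checks the control comparison constraint and terminal set membership for a sampled candidate solution. It assumes a scalar control, a two-dimensional state $z = (q, \dot{q})$ and a symmetric 2x2 matrix $P_i$; the function names are hypothetical and the checks only hold at the sample points, not over the whole interval.

```python
import math

def within_comparison_bound(u, u_hat, delta, kappa):
    """True if |u - u_hat| <= delta**2 * kappa at every sample."""
    bound = delta ** 2 * kappa
    return all(abs(a - b) <= bound for a, b in zip(u, u_hat))

def in_terminal_set(z, z_c, P, eps):
    """True if the P-weighted norm of z - z_c is at most eps (P is 2x2, symmetric)."""
    e0, e1 = z[0] - z_c[0], z[1] - z_c[1]
    quad = P[0][0] * e0 * e0 + 2.0 * P[0][1] * e0 * e1 + P[1][1] * e1 * e1
    return math.sqrt(quad) <= eps
```

Note how the allowed deviation shrinks quadratically with the update period: halving `delta` cuts the bound `delta**2 * kappa` by a factor of four.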

Remark  6.  Instead  of  communicating  assumed  controls,  neighboring  agents  could  compute 
and  transmit  the  corresponding  assumed  state,  since  that  is  what  each  distributed  optimal 
control  problem  depends  upon.  That  would  remove  the  need  for  the  differential  equation  of 
each  neighbor,  simplifying  each  local  optimization  problem.  We  shall  discuss  this  further  in 
Section  4.3. 

The optimal solution to each distributed optimal control problem, assumed to exist, is denoted

$$
u^*_{di}(\tau; z_i(t_k)), \quad \tau \in [t_k, t_k+T].
$$

The closed-loop system, for which stability is to be guaranteed, is

$$
\dot{z}(\tau) = A z(\tau) + B u^*_{dist}(\tau), \quad \tau \ge t_0, \qquad (8)
$$

where the distributed receding horizon control law is

$$
u^*_{dist}(\tau; z(t_k)) = \big(u^*_{d1}(\tau; z_1(t_k)), \dots, u^*_{dN_a}(\tau; z_{N_a}(t_k))\big),
$$

for $\tau \in [t_k, t_k+\delta]$, $0 < \delta < T$ and $k \in \mathbb{N}$. As before, the receding horizon control law is updated when each new initial state update $z(t_k) \leftarrow z(t_{k+1})$ is available. The optimal state for agent $i$ is denoted $z^*_{di}(\tau; z_i(t_k))$, for all $\tau \in [t_k, t_k+T]$. The concatenated vector of the distributed optimal states is denoted

$$
z^*_{dist}(\tau; z(t_k)) = \big(z^*_{d1}(\tau; z_1(t_k)), \dots, z^*_{dN_a}(\tau; z_{N_a}(t_k))\big),
$$

for all $\tau \in [t_k, t_k+T]$. Although we denote the optimal control for agent $i$ as $u^*_{di}(\tau; z_i(t_k))$, it is understood that this control is implicitly dependent on the initial state $z_i(t_k)$ and the initial states of the neighbors $z_{-i}(t_k)$.




Assumption 3. The following conditions are satisfied, for every $i = 1, \dots, N_a$:

(i) the positive constants $\varepsilon_i > 0$ are chosen such that $\Omega_i(\varepsilon_i) \subseteq \mathcal{Z}$ and such that an asymptotically stabilizing feedback $u_i = K_i(z_i - z_i^c)$ and positive-definite, symmetric matrix $P_i$ satisfy

$$
\begin{aligned}
(A_i + B_i K_i)^T P_i + P_i (A_i + B_i K_i) &= -(Q_i + \mu K_i^T K_i), \\
K_i(z_i - z_i^c) &\in \mathcal{U}, \quad \forall z_i \in \Omega_i(\varepsilon_i).
\end{aligned}
$$

Moreover, $Q_i$ is chosen such that $\hat{Q} = \mathrm{diag}(Q_1, \dots, Q_{N_a})$ satisfies $\hat{Q} \ge Q$, where $Q$ is defined in equation (3);

(ii) at any receding horizon update time, the collection of open-loop optimal control problems in Problem 2 are solved globally synchronously;

(iii) communication of control trajectories between neighboring agents is lossless.

Remark 7. The receding horizon control law is employed for all time after the initialization and the decoupled linear feedbacks $K_i$ need not be employed, even after agent $i$ enters $\Omega_i(\varepsilon_i)$. The value for $\varepsilon_i$ simply determines the size of the set over which the conditions in Assumption 3 are satisfied. It may be desired that an agent switch from the receding horizon controller to the decoupled linear feedbacks once inside the terminal constraint set, known as dual-mode receding horizon control in the literature [20]. Generally, $\varepsilon_i$ could be chosen to satisfy the conditions in Assumption 3 and the additional condition

$$
\Omega_i(\varepsilon_i) \cap \Omega_j(\varepsilon_j) = \emptyset, \quad \text{for all } i, j = 1, \dots, N_a,\ i \ne j,
$$

so that once inside $\Omega_i(\varepsilon_i)$, any agent $i$ is closer to its objective state than to any other agent's objective state. In that case, the agents employ the decoupled feedbacks only if they are close enough to their objective state and far enough from any other agent's objective state.

Remark  8.  Due  to  condition  (ii)  above,  the  distributed  receding  horizon  control  laws  are  not 
technically  decentralized,  since  a  globally  synchronous  implementation  requires  centralized 
clock  keeping  [3].  However,  a  locally  synchronous,  and  consequently  decentralized,  version 
is  also  currently  being  constructed  [9]. 

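Although not employed, the decoupled linear feedbacks $K_i$ can asymptotically stabilize each agent to its objective state once the agent enters the decoupled terminal constraint set (7). One choice for $Q_i$ that would satisfy $\hat{Q} \ge Q$ is $Q_i = \lambda_{\max}(Q) I_{2n}$, where $\lambda_{\max}(Q)$ is the maximum eigenvalue of the symmetric matrix $Q$. Define

$$
K = \mathrm{diag}(K_1, \dots, K_{N_a}), \qquad P = \mathrm{diag}(P_1, \dots, P_{N_a}).
$$

As a consequence of assumption (i) above, $P$ is positive-definite and symmetric, and satisfies

$$
(A + BK)^T P + P (A + BK) = -(\hat{Q} + \mu K^T K) \le -(Q + \mu K^T K). \qquad (10)
$$

We now define the initialization procedure for the distributed receding horizon control law, and the assumed control for each agent at each update time.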
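As an illustration of condition (i) in Assumption 3, the sketch below solves the Lyapunov equation in closed form for a single agent. It assumes a scalar double integrator, $A_i = [[0, 1], [0, 0]]$, $B_i = [0, 1]^T$, with a feedback gain $K_i = [-k_1, -k_2]$, $k_1, k_2 > 0$; these dynamics, the gain parameterization and the function names are assumptions made for the example, and a general problem would need a numerical Lyapunov solver instead.

```python
def terminal_weight(k1, k2, Qi, mu):
    """Solve (A+BK)^T P + P (A+BK) = -(Qi + mu K^T K) for the scalar double
    integrator with K = [-k1, -k2]; Qi is a symmetric 2x2 nested list."""
    w11 = Qi[0][0] + mu * k1 * k1
    w12 = Qi[0][1] + mu * k1 * k2
    w22 = Qi[1][1] + mu * k2 * k2
    # entry-wise solution of the three independent scalar equations
    p12 = w11 / (2.0 * k1)
    p22 = (2.0 * p12 + w22) / (2.0 * k2)
    p11 = -w12 + k2 * p12 + k1 * p22
    return [[p11, p12], [p12, p22]]

def lyapunov_residual(P, k1, k2, Qi, mu):
    """Largest absolute entry of (A+BK)^T P + P (A+BK) + (Qi + mu K^T K)."""
    Acl = [[0.0, 1.0], [-k1, -k2]]
    K = [-k1, -k2]
    worst = 0.0
    for r in range(2):
        for c in range(2):
            lhs = sum(Acl[m][r] * P[m][c] + P[r][m] * Acl[m][c] for m in range(2))
            rhs = -(Qi[r][c] + mu * K[r] * K[c])
            worst = max(worst, abs(lhs - rhs))
    return worst
```

With `k1 = 1`, `k2 = 2`, `Qi` the identity and `mu = 1`, the solution is `P = [[1.75, 1.0], [1.0, 1.75]]`, which is positive definite since its leading entry and determinant are both positive.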

We  now  define  the  initialization  procedure  for  the  distributed  receding  horizon  control  law, 
and  the  assumed  control  for  each  agent  at  each  update  time. 
Definition 7. (Initialization) Denote time $t_{-1} = t_0 - \delta$. Solve Problem 2 with initial state $z(t_{-1})$, setting $\hat{u}_i(\tau; z_i(t_{-1})) = 0$ for all $\tau \in [t_{-1}, t_{-1}+T]$ and every $i = 1, \dots, N_a$, and also setting $\kappa = +\infty$. The optimal trajectories are denoted $u^*_{di}(\tau; z_i(t_{-1}))$ and $z^*_{di}(\tau; z_i(t_{-1}))$, for every $i = 1, \dots, N_a$. The optimal control $u^*_{dist}(\tau; z(t_{-1}))$ is applied for $\tau \in [t_{-1}, t_0]$.

At initialization, the control comparison constraint is effectively removed by setting $\kappa$ to a large number. The assumed controls at initialization will have an impact on closed-loop performance. If instead the centralized problem were solved at time $t_{-1}$, and the solution disseminated to the agents, the closed-loop performance may be closer to that of the centralized implementation. In the simulation results, we note that the performance is already close to that of the centralized implementation using the initialization procedure as defined above.


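Definition 8. (Assumed Control) For each agent $i = 1, \dots, N_a$ and for any $k \in \mathbb{N}$, the assumed control $\hat{u}_i(\cdot; z_i(t_k)) : [t_k, t_k+T] \to \mathcal{U}$ is defined as follows:

if $z(t_k) = z^c$, then $\hat{u}_i(\tau; z_i(t_k)) = 0$, $\tau \in [t_k, t_k+T]$,

otherwise

$$
\hat{u}_i(\tau; z_i(t_k)) =
\begin{cases}
u^*_{di}(\tau; z_i(t_{k-1})), & \tau \in [t_k, t_{k-1}+T), \\
K_i\big(z_i^K(\tau - t_{k-1} - T;\ z_i^K(0)) - z_i^c\big), & \tau \in [t_{k-1}+T, t_k+T],
\end{cases}
$$

where $z_i^K(0) = z^*_{di}(t_{k-1}+T; z_i(t_{k-1}))$ and $z_i^K(s; z_i^K(0))$ is the closed-loop solution to

$$
\dot{z}_i^K(s) = (A_i + B_i K_i)(z_i^K(s) - z_i^c), \quad s \ge 0,
$$

given $z_i^K(0)$.

The assumed control for agent $i$ at initial time $t_k$ is generated and transmitted to each neighbor $j \in \mathcal{N}_i$ in the time window $[t_{k-1}, t_k]$.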
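The construction in Definition 8 (for the case $z(t_k) \ne z^c$) can be sketched on a sampled grid as follows. The example assumes a scalar double integrator, a feedback $K_i = [-k_1, -k_2]$, an update period equal to `shift` grid steps, and forward-Euler integration of the closed-loop tail; all names and discretization choices are hypothetical and serve only to show the shift-and-append structure.

```python
def assumed_control(u_prev, z_term, shift, gains, z_c, dt):
    """Build the assumed control at t_k from the previous optimal solution.

    u_prev : samples of the previous optimal control on [t_{k-1}, t_{k-1}+T)
    z_term : previous optimal terminal state (q, v) at t_{k-1}+T
    shift  : number of grid steps in one update period delta
    gains  : (k1, k2), so that the feedback is u = -k1*(q-qc) - k2*(v-vc)
    """
    k1, k2 = gains
    # remainder of the previous optimal control, covering [t_k, t_{k-1}+T)
    u_hat = list(u_prev[shift:])
    # tail on [t_{k-1}+T, t_k+T]: decoupled linear feedback from z_term
    q, v = z_term
    for _ in range(shift):
        u = -k1 * (q - z_c[0]) - k2 * (v - z_c[1])
        u_hat.append(u)
        q, v = q + dt * v, v + dt * u  # forward-Euler step (an approximation)
    return u_hat
```

The returned sequence has the same length as `u_prev`, so each neighbor always receives a trajectory covering the full horizon $[t_k, t_k+T]$.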

To state Definition 8 in words, in Problem 2 every agent is assuming all neighbors will continue along their previous optimal path, finishing with the decoupled linear control laws defined in Assumption 3, unless the control objective is met at any update time after initialization. In the latter case, neighbors are assumed to do nothing, i.e., apply zero control. Notice that the communication of control trajectories between neighboring agents is not required to happen instantaneously, but over each receding horizon update time interval.

Remark 9. The test of whether $z(t_k) = z^c$ in generating the assumed control is a centralized test. The reason for the test is its use in the proof of Proposition 2 in the next section. We note that the asymptotic stability result in the next section guarantees that only in the limit as $t_k \to \infty$ do we have $z(t_k) \to z^c$. Practically then, one could assume $z(t_k) \ne z^c$, which is true for any finite $k$ when $z(t_{-1}) \ne z^c$, and ignore the test completely. Also, if dual-mode receding horizon control is used, the test can be removed, since Proposition 2 is not used to prove asymptotic stability in that case. A dual-mode version will be provided in the next section.


If $J_i^*(z_i(t_{-1}), z_{-i}(t_{-1}), T) = 0$ for any agent $i$, then it can be shown that $z_i(t_{-1}) = z_i^c$ and $z_j(t_{-1}) = z_j^c$, for each neighbor $j \in \mathcal{N}_i$, is the unique feasible solution, i.e., the local objective has been met. However, even if $J_i^*(z_i(t_{-1}), z_{-i}(t_{-1}), T) = 0$, it may not remain zero for all $k \in \mathbb{N}$. An example is where $i$ and all neighbors $j \in \mathcal{N}_i$ are initialized meeting their objective, but some $l \in \mathcal{N}_j$ has not met its objective. Thus, in the subsequent optimizations, $j$ will react to $l$, followed by $i$ reacting to $j$, since the coupling cost terms become nontrivial. Consequently, we cannot guarantee that each distributed optimal value function $J_i^*(z_i(t_k), z_{-i}(t_k), T)$ will decrease with each receding horizon update. Instead, we show in the next section that the sum of the distributed optimal value functions is a Lyapunov function that does decrease at each update, enabling a proof that the distributed receding horizon control laws collectively meet the control objective.

4.2  Stability  Analysis 

We now proceed with analyzing the distributed receding horizon control laws. At any time $t_k$, $k \in \mathbb{N}$, the sum of the optimal distributed value functions is denoted as

$$
J_\Sigma^*(z(t_k), T) = \sum_{i=1}^{N_a} J_i^*(z_i(t_k), z_{-i}(t_k), T).
$$

For stability of the distributed receding horizon control laws, we investigate $J_\Sigma^*(z(t_k), T)$ as a Lyapunov function.

Definition 9. Problem 2 is feasible at time $t_k$ if for every $i = 1, \dots, N_a$, there exists a control $u_i(\cdot; z_i(t_k)) : [t_k, t_k+T] \to \mathcal{U}$ such that all the constraints are satisfied and the value function $J_i(z_i(t_k), z_{-i}(t_k), u_i(\cdot), T)$ is bounded. Let $\mathcal{Z}_\Sigma \subset \mathcal{Z}^{N_a}$ denote the set of initial states for which Problem 2 is feasible at initialization (time $t = t_{-1}$), as defined in Definition 7.

Lemma 1. Under Assumptions 1 and 3, $\mathcal{Z}_\Sigma$ is a positively invariant set with respect to the closed-loop system (8) setting $u^*_{di}(\cdot; z_i(t_k)) = \hat{u}_i(\cdot; z_i(t_k))$ for every $i = 1, \dots, N_a$ and for $k \in \mathbb{N}$. Thus, feasibility at initialization implies subsequent feasibility.

The proof follows immediately from Definitions 7 and 8. Note that the assumed control $\hat{u}_i$ is exactly the feasible control trajectory used in Lemma 2 of [6] to show initial feasibility implies subsequent feasibility of the on-line optimization problem in the centralized case. Clearly, $z^c$ is in the set $\mathcal{Z}_\Sigma$.

Remark 10. Since we will be exploring the closed-loop behavior for initial states that start in $\mathcal{Z}_\Sigma$, we can immediately infer that any closed-loop state trajectory will remain bounded. Specifically, if an initial state can be driven to the compact terminal constraint set in finite time using bounded control ($\mathcal{U}$ is compact), then the optimal trajectory from that state will remain bounded. In the bounding argument for the proof of stability, we will make use of the notation

$$
\|z^*_{di}(\tau; z_i(t_k)) - z_i^c\| \le R, \quad \text{for any } \tau \in [t_k, t_k+T],\ k \in \mathbb{N},\ \text{for all } i = 1, \dots, N_a. \qquad (11)
$$

Moreover, let $U_{\max} > 0$ be a positive scalar denoting the maximum-norm value over all feasible controls $u(t) \in \mathcal{U}$ at any time $t$.
We make the following assumption.

Assumption 4. The optimal solution to Problem 2 exists and is numerically obtainable for any $z(t_k) \in \mathcal{Z}_\Sigma$.

Given the additional assumption above, we have the following result.

Lemma 2. Under Assumptions 1, 3 and 4, $\mathcal{Z}_\Sigma$ is a positively invariant set with respect to the closed-loop system (8). Thus, if $z(t_{-1}) \in \mathcal{Z}_\Sigma$, then $z^*_{dist}(\tau) \in \mathcal{Z}_\Sigma$ for all $\tau \ge t_{-1}$.

The next result says that the net objective of the distributed receding horizon control laws is consistent with the control objective.

Proposition 2. Under Assumptions 1, 3 and 4, for a given fixed horizon time $T > 0$ and at any time $t_k$, $k \in \mathbb{N}$,

1. $J_\Sigma^*(z(t_k), T) \ge 0$ for any $z(t_k) \in \mathcal{Z}_\Sigma$, and $J_\Sigma^*(z(t_k), T) = 0$ if and only if $z(t_k) = z^c$,

2. $J_\Sigma^*(z(t_k), T)$ is continuous at $z(t_k) = z^c$.

Proof. The proposition is similar to Lemma A.1 in [5]. We prove item 1 here, while item 2 follows closely along the lines of the proof in the lemma.

1. The non-negativity of $J_\Sigma^*(z(t_k), T)$ follows directly from $L_i(z_i, z_{-i}, u_i) \ge 0$ and $P_i > 0$, for all $i = 1, \dots, N_a$. It remains therefore to prove that equality to zero is equivalent to $z(t_k) = z^c$.

($\Rightarrow$) $J_\Sigma^*(z(t_k), T) = 0$ implies that for each $i = 1, \dots, N_a$,

$$
\gamma \|z^*_{di}(t_k+T; z_i(t_k)) - z_i^c\|^2_{P_i} = 0 \quad \text{and} \quad \int_{t_k}^{t_k+T} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big)\, d\tau = 0.
$$

Since the integrand is piecewise continuous in $\tau$ on $[t_k, t_k+T]$ and nonnegative, we have that

$$
L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big) = 0, \quad \forall \tau \in [t_k, t_k+T],
$$

and for every $i = 1, \dots, N_a$. From Definition 5, this implies $u^*_{di}(\tau; z_i(t_k)) = 0$ and $\dot{q}^*_{di}(\tau; z_i(t_k)) = 0$, for all $\tau \in [t_k, t_k+T]$, meeting the control objective for the velocity for every $i$. Further, since the distributed optimal velocity is identically zero, $q^*_{di}(\tau; q_i(t_k)) = q_i(t_k)$, for all $\tau \in [t_k, t_k+T]$ and every $i$. Finally, from the distributed terminal costs, we have

$$
\gamma \|z_i(t_k) - z_i^c\|^2_{P_i} = 0, \quad \text{for every } i = 1, \dots, N_a.
$$

Since every $P_i$ is positive definite and $\gamma$ is positive, $z_i(t_k) = z_i^c$ for every $i$, which is equivalent to $z(t_k) = z^c$. We must also guarantee that the resulting distributed optimal control and state are feasible. The constraints $z^*_{di}(\tau; z_i(t_k)) \in \mathcal{Z}$ and $u^*_{di}(\tau; z_i(t_k)) \in \mathcal{U}$ are trivial, since $z_i^c$ and $0$ are in the interior of $\mathcal{Z}$ and $\mathcal{U}$, respectively, by Assumption 1. Also, $z_i^c \in \Omega_i(\varepsilon_i)$ so the terminal constraint is satisfied. Finally, since $z(t_k) = z^c$, by Definition 8, $\hat{u}_i(\tau; z_i(t_k)) = 0$ for all $\tau \in [t_k, t_k+T]$ and every $i$, so the control comparison constraint is also satisfied.
($\Leftarrow$) Given $z(t_k) = z^c$, by Definition 8 we have that for each $i = 1, \dots, N_a$, $\hat{u}_i(\tau; z_i(t_k)) = 0$ for all $\tau \in [t_k, t_k+T]$. Consequently, $\dot{\hat{q}}_i(\tau; z_i(t_k)) = \dot{q}_i(t_k) = 0$ and $\hat{q}_i(\tau; z_i(t_k)) = q_i(t_k) = q_i^c$, for all $\tau \in [t_k, t_k+T]$, and every $i$. For any agent $i$, the local value function becomes

$$
\int_{t_k}^{t_k+T} \sum_{j \in \mathcal{N}_i} \frac{\gamma\omega}{2} \|q_i(\tau; z_i(t_k)) - q_i^c\|^2 + \gamma\big(\nu \|\dot{q}_i(\tau; z_i(t_k))\|^2 + \mu \|u_i(\tau; z_i(t_k))\|^2\big)\, d\tau + \gamma \|z_i(t_k+T; z_i(t_k)) - z_i^c\|^2_{P_i},
$$

using $q_i^c + d_{ij} = q_j^c$ for each neighbor $j \in \mathcal{N}_i$. If $i$ is a core vehicle (i.e., $i$ is 1, 2 or 3), we also have an additional term due to the tracking cost. In any case, given $z_i(t_k) = z_i^c$, the optimal and feasible control for each open-loop optimization problem is $u^*_{di}(\tau; z_i(t_k)) = 0$ for all $\tau \in [t_k, t_k+T]$. Therefore, $\dot{q}^*_{di}(\tau; z_i(t_k)) = 0$ and $q^*_{di}(\tau; q_i(t_k)) = q_i(t_k) = q_i^c$, for all $\tau \in [t_k, t_k+T]$. Since this holds for any $i$, every $J_i^*(z_i(t_k), z_{-i}(t_k), T) = 0$ and so $J_\Sigma^*(z(t_k), T) = 0$. ■


The condition "if $z(t_k) = z^c$ then $\hat{u}_i(\tau; z_i(t_k)) = 0$" in Definition 8 was used in showing the equivalence of $J_\Sigma^*(z(t_k), T) = 0$ and $z(t_k) = z^c$. For a dual-mode implementation, this equivalence is not required, eliminating the need for this condition in constructing any assumed control.

Our objective is to show the distributed receding horizon control law achieves the control objective for sufficiently small $\delta$. We begin with three lemmas that are used to bound the Lyapunov function candidate $J_\Sigma^*(z(t_k), T)$. The first lemma gives a bounding result on the decrease in $J_\Sigma^*(\cdot, T)$ from one update to the next.

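Lemma 3. Under Assumptions 1, 3 and 4, for a given fixed horizon time $T > 0$, there exists a positive constant $\xi > 0$ such that

$$
J_\Sigma^*(z(t_{k+1}), T) - J_\Sigma^*(z(t_k), T) \le -\int_{t_k}^{t_k+\delta} \sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k))\big)\, d\tau + \delta^2 \xi,
$$

for any $\delta \in (0, T]$ and for any $z(t_k) \in \mathcal{Z}_\Sigma$, $k \in \mathbb{N}$.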

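Proof. The sum of the optimal distributed value functions for a given $z(t_k) \in \mathcal{Z}_\Sigma$ is

$$
J_\Sigma^*(z(t_k), T) = \int_{t_k}^{t_k+T} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big)\, d\tau + \gamma \|z^*_{dist}(t_k+T; z(t_k)) - z^c\|^2_P.
$$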

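Applying the optimal control for some $\delta \in (0, T]$ seconds, we are now at time $t_{k+1} = t_k + \delta$, with new state update $z(t_{k+1})$. A feasible control for the optimal control problem of each agent over the new time interval $\tau \in [t_{k+1}, t_{k+1}+T]$ is $u_i(\cdot; z_i(t_{k+1})) = \hat{u}_i(\cdot; z_i(t_{k+1}))$, given in Definition 8. Thus, we have for any $i = 1, \dots, N_a$,

$$
\begin{aligned}
J_i^*(z_i(t_{k+1}), z_{-i}(t_{k+1}), T)
&= \int_{t_{k+1}}^{t_{k+1}+T} L_i\big(z^*_{di}(\tau; z_i(t_{k+1})), \hat{z}_{-i}(\tau; z_{-i}(t_{k+1})), u^*_{di}(\tau; z_i(t_{k+1}))\big)\, d\tau \\
&\quad + \gamma \|z^*_{di}(t_{k+1}+T; z_i(t_{k+1})) - z_i^c\|^2_{P_i} \\
&\le \int_{t_{k+1}}^{t_{k+1}+T} L_i\big(\hat{z}_i(\tau; z_i(t_{k+1})), \hat{z}_{-i}(\tau; z_{-i}(t_{k+1})), \hat{u}_i(\tau; z_i(t_{k+1}))\big)\, d\tau \\
&\quad + \gamma \|\hat{z}_i(t_{k+1}+T; z_i(t_{k+1})) - z_i^c\|^2_{P_i}.
\end{aligned}
$$

Summing over $i$, we can upper bound $J_\Sigma^*(z(t_{k+1}), T)$, and comparing to $J_\Sigma^*(z(t_k), T)$ we have

$$
\begin{aligned}
& J_\Sigma^*(z(t_{k+1}), T) - J_\Sigma^*(z(t_k), T) + \int_{t_k}^{t_{k+1}} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big)\, d\tau \\
&\le \int_{t_{k+1}}^{t_k+T} \sum_{i=1}^{N_a} L_i\big(\hat{z}_i(\tau; z_i(t_{k+1})), \hat{z}_{-i}(\tau; z_{-i}(t_{k+1})), \hat{u}_i(\tau; z_i(t_{k+1}))\big)\, d\tau \\
&\quad - \int_{t_{k+1}}^{t_k+T} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big)\, d\tau \\
&\quad + \int_{t_k+T}^{t_{k+1}+T} \sum_{i=1}^{N_a} L_i\big(\hat{z}_i(\tau; z_i(t_{k+1})), \hat{z}_{-i}(\tau; z_{-i}(t_{k+1})), \hat{u}_i(\tau; z_i(t_{k+1}))\big)\, d\tau \\
&\quad + \gamma \sum_{i=1}^{N_a} \|\hat{z}_i(t_{k+1}+T; z_i(t_{k+1})) - z_i^c\|^2_{P_i} - \gamma \|z^*_{dist}(t_k+T; z(t_k)) - z^c\|^2_P.
\end{aligned}
$$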


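Using the notation $\hat{z}(\cdot; z) = (\hat{z}_1(\cdot; z_1), \dots, \hat{z}_{N_a}(\cdot; z_{N_a}))$, we observe that

$$
\sum_{i=1}^{N_a} \|\hat{z}_i(t_{k+1}+T; z_i(t_{k+1})) - z_i^c\|^2_{P_i} = \|\hat{z}(t_{k+1}+T; z(t_{k+1})) - z^c\|^2_P,
$$

and also that $\hat{z}(t_k+T; z(t_{k+1})) = z^*_{dist}(t_k+T; z(t_k))$. From the definition for each $\hat{u}_i(\tau; z_i(t_{k+1}))$ with $\tau \in [t_k+T, t_{k+1}+T]$ and using the notation in equation (10), we also have

$$
\sum_{i=1}^{N_a} L_i\big(\hat{z}_i(\tau; z_i(t_{k+1})), \hat{z}_{-i}(\tau; z_{-i}(t_{k+1})), \hat{u}_i(\tau; z_i(t_{k+1}))\big) = \gamma \|\hat{z}(\tau; z(t_{k+1})) - z^c\|^2_{\tilde{Q}},
$$

where $\tilde{Q} = Q + \mu K^T K$. Finally, from the properties of the feedback in Assumption 3 and the notation in equation (10), we have

$$
\|\hat{z}(t_{k+1}+T; z(t_{k+1})) - z^c\|^2_P - \|\hat{z}(t_k+T; z(t_{k+1})) - z^c\|^2_P = -\int_{t_k+T}^{t_{k+1}+T} \|\hat{z}(\tau; z(t_{k+1})) - z^c\|^2_{\hat{Q}'}\, d\tau,
$$

where $\hat{Q}' = \hat{Q} + \mu K^T K$ and $\hat{Q}' \ge \tilde{Q}$. Thus, we can write

$$
\begin{aligned}
& \int_{t_k+T}^{t_{k+1}+T} \sum_{i=1}^{N_a} L_i\big(\hat{z}_i(\tau; z_i(t_{k+1})), \hat{z}_{-i}(\tau; z_{-i}(t_{k+1})), \hat{u}_i(\tau; z_i(t_{k+1}))\big)\, d\tau \\
&\quad + \gamma \sum_{i=1}^{N_a} \|\hat{z}_i(t_{k+1}+T; z_i(t_{k+1})) - z_i^c\|^2_{P_i} - \gamma \|z^*_{dist}(t_k+T; z(t_k)) - z^c\|^2_P \\
&= -\gamma \int_{t_k+T}^{t_{k+1}+T} \|\hat{z}(\tau; z(t_{k+1})) - z^c\|^2_{\hat{Q}' - \tilde{Q}}\, d\tau \le 0.
\end{aligned}
$$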


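Now, we have

$$
\begin{aligned}
& J_\Sigma^*(z(t_{k+1}), T) - J_\Sigma^*(z(t_k), T) + \int_{t_k}^{t_{k+1}} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big)\, d\tau \\
&\le \int_{t_{k+1}}^{t_k+T} \sum_{i=1}^{N_a} L_i\big(\hat{z}_i(\tau; z_i(t_{k+1})), \hat{z}_{-i}(\tau; z_{-i}(t_{k+1})), \hat{u}_i(\tau; z_i(t_{k+1}))\big)\, d\tau \\
&\quad - \int_{t_{k+1}}^{t_k+T} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big)\, d\tau.
\end{aligned}
$$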

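By definition, each $\hat{z}_i(\tau; z_i(t_{k+1})) = z^*_{di}(\tau; z_i(t_k))$ and $\hat{u}_i(\tau; z_i(t_{k+1})) = u^*_{di}(\tau; z_i(t_k))$, over the interval $\tau \in [t_{k+1}, t_k+T]$. Consequently, we have from Definition 5

$$
\begin{aligned}
& \int_{t_{k+1}}^{t_k+T} \sum_{i=1}^{N_a} L_i\big(\hat{z}_i(\tau; z_i(t_{k+1})), \hat{z}_{-i}(\tau; z_{-i}(t_{k+1})), \hat{u}_i(\tau; z_i(t_{k+1}))\big)\, d\tau \\
&\quad - \int_{t_{k+1}}^{t_k+T} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big)\, d\tau \\
&= \gamma \int_{t_{k+1}}^{t_k+T} \sum_{i=1}^{N_a} \sum_{j \in \mathcal{N}_i} \frac{\omega}{2} \Big\{ \|q^*_{di}(\tau; z_i(t_k)) - q^*_{dj}(\tau; z_j(t_k)) + d_{ij}\|^2 \\
&\qquad\qquad - \|q^*_{di}(\tau; z_i(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + d_{ij}\|^2 \Big\}\, d\tau \\
&\quad + \gamma \int_{t_{k+1}}^{t_k+T} \sum_{(i,j,l) \in \mathcal{E}_c} \frac{\omega}{27} \Big\{ \|q^*_{di}(\tau; z_i(t_k)) + q^*_{dj}(\tau; z_j(t_k)) + q^*_{dl}(\tau; z_l(t_k)) - 3q_d\|^2 \\
&\qquad\qquad - \|q^*_{di}(\tau; z_i(t_k)) + \hat{q}_j(\tau; z_j(t_k)) + \hat{q}_l(\tau; z_l(t_k)) - 3q_d\|^2 \Big\}\, d\tau,
\end{aligned}
$$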


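where $\mathcal{E}_c = \{(1, 2, 3), (3, 1, 2), (2, 3, 1)\}$. For the remainder of the proof, we are interested in finding an upper bound for the expression on the right hand side of the equality above. Consider the terms in the first integral

$$
\|q^*_{di}(\tau; z_i(t_k)) - q^*_{dj}(\tau; z_j(t_k)) + d_{ij}\|^2 - \|q^*_{di}(\tau; z_i(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + d_{ij}\|^2,
$$

over the interval $\tau \in [t_{k+1}, t_k+T]$. The first term can be bounded using the triangle inequality as

$$
\begin{aligned}
& \|q^*_{di}(\tau; z_i(t_k)) - q^*_{dj}(\tau; z_j(t_k)) + d_{ij}\|^2 \\
&= \|q^*_{di}(\tau; z_i(t_k)) - q^*_{dj}(\tau; z_j(t_k)) \pm \hat{q}_j(\tau; z_j(t_k)) + d_{ij}\|^2 \\
&\le \|q^*_{di}(\tau; z_i(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + d_{ij}\|^2 + \|\hat{q}_j(\tau; z_j(t_k)) - q^*_{dj}(\tau; z_j(t_k))\|^2 \\
&\quad + 2 \|q^*_{di}(\tau; z_i(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + d_{ij}\| \cdot \|\hat{q}_j(\tau; z_j(t_k)) - q^*_{dj}(\tau; z_j(t_k))\|.
\end{aligned}
$$

With this upper bound, we have a cancellation with the negative term $\|q^*_{di}(\tau; z_i(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + d_{ij}\|^2$ in the integral. Now, considering the terms in the second integral, particularly the first term, we can use that same bounding argument to get

$$
\begin{aligned}
& \|q^*_{di}(\tau; z_i(t_k)) + q^*_{dj}(\tau; z_j(t_k)) + q^*_{dl}(\tau; z_l(t_k)) - 3q_d\|^2 \\
&= \|q^*_{di}(\tau; z_i(t_k)) + q^*_{dj}(\tau; z_j(t_k)) + q^*_{dl}(\tau; z_l(t_k)) \pm \hat{q}_j(\tau; z_j(t_k)) \pm \hat{q}_l(\tau; z_l(t_k)) - 3q_d\|^2 \\
&\le \|q^*_{di}(\tau; z_i(t_k)) + \hat{q}_j(\tau; z_j(t_k)) + \hat{q}_l(\tau; z_l(t_k)) - 3q_d\|^2 \\
&\quad + 2 \|q^*_{di}(\tau; z_i(t_k)) + \hat{q}_j(\tau; z_j(t_k)) + \hat{q}_l(\tau; z_l(t_k)) - 3q_d\| \\
&\qquad \cdot \|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + q^*_{dl}(\tau; z_l(t_k)) - \hat{q}_l(\tau; z_l(t_k))\| \\
&\quad + \|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + q^*_{dl}(\tau; z_l(t_k)) - \hat{q}_l(\tau; z_l(t_k))\|^2.
\end{aligned}
$$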

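Using the triangle inequality again we bound the term

$$
\begin{aligned}
& \|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + q^*_{dl}(\tau; z_l(t_k)) - \hat{q}_l(\tau; z_l(t_k))\| \\
&\le \|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k))\| + \|q^*_{dl}(\tau; z_l(t_k)) - \hat{q}_l(\tau; z_l(t_k))\|.
\end{aligned}
$$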

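Substitution back into the integral equations, after cancellations, yields an upper bound given as

$$
\begin{aligned}
& \gamma \int_{t_{k+1}}^{t_k+T} \sum_{i=1}^{N_a} \sum_{j \in \mathcal{N}_i} \frac{\omega}{2} \Big\{ 2 \|q^*_{di}(\tau; z_i(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + d_{ij}\| \cdot \|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k))\| \\
&\qquad\qquad + \|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k))\|^2 \Big\}\, d\tau \\
&+ \gamma \int_{t_{k+1}}^{t_k+T} \sum_{(i,j,l) \in \mathcal{E}_c} \frac{\omega}{27} \Big\{ 2 \|q^*_{di}(\tau; z_i(t_k)) + \hat{q}_j(\tau; z_j(t_k)) + \hat{q}_l(\tau; z_l(t_k)) - 3q_d\| \\
&\qquad\qquad \cdot \big( \|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k))\| + \|q^*_{dl}(\tau; z_l(t_k)) - \hat{q}_l(\tau; z_l(t_k))\| \big) \\
&\qquad\qquad + \big( \|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k))\| + \|q^*_{dl}(\tau; z_l(t_k)) - \hat{q}_l(\tau; z_l(t_k))\| \big)^2 \Big\}\, d\tau.
\end{aligned}
$$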

Since each term in the integral is non-negative, we can upper bound this expression by extending the interval of integration from $[t_{k+1}, t_k+T]$ to $[t_k, t_k+T]$. The reason for doing this is that for any $j = 1, \dots, N_a$, $z^*_{dj}(t_k; z_j(t_k)) = \hat{z}_j(t_k; z_j(t_k))$, thereby matching the initial conditions for the two trajectories we are comparing. Now, for any $j$, we can bound the term

$$
\begin{aligned}
\|q^*_{dj}(\tau; z_j(t_k)) - \hat{q}_j(\tau; z_j(t_k))\|
&\le \int_{t_k}^{\tau} \|\dot{q}^*_{dj}(s; z_j(t_k)) - \dot{\hat{q}}_j(s; z_j(t_k))\|\, ds \\
&\le \int_{t_k}^{\tau} \int_{t_k}^{s} \|u^*_{dj}(\eta; z_j(t_k)) - \hat{u}_j(\eta; z_j(t_k))\|\, d\eta\, ds \\
&\le \int_{t_k}^{\tau} \int_{t_k}^{s} \kappa \delta^2\, d\eta\, ds = \frac{1}{2} \kappa \delta^2 (\tau - t_k)^2.
\end{aligned}
$$
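The position-gap bound above can be checked numerically. The sketch below, which assumes scalar double-integrator dynamics and controls held constant over each grid step, propagates the difference between the optimal and assumed trajectories (by linearity the gap obeys the same dynamics, driven by the control difference) and compares it with $\frac{1}{2}\kappa\delta^2(\tau - t_k)^2$. The function names are hypothetical.

```python
def position_gap(u_star, u_hat, dt):
    """Return |q*(tau) - q_hat(tau)| at each grid point after t_k, for two
    double integrators started from the same state."""
    dq = dv = 0.0
    gaps = []
    for a, b in zip(u_star, u_hat):
        du = a - b
        # exact update of the gap dynamics with constant input difference
        dq += dv * dt + 0.5 * du * dt * dt
        dv += du * dt
        gaps.append(abs(dq))
    return gaps

def gap_bound(kappa, delta, elapsed):
    """Analytic bound 0.5 * kappa * delta**2 * (tau - t_k)**2."""
    return 0.5 * kappa * delta ** 2 * elapsed ** 2
```

The bound is tight: a control difference held at the maximum allowed value `delta**2 * kappa` over the whole interval attains it exactly.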

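From equation (11), and using the triangle inequality and $q_i^c + d_{ij} = q_j^c$, we can also bound the term

$$
\|q^*_{di}(\tau; z_i(t_k)) - \hat{q}_j(\tau; z_j(t_k)) + d_{ij}\| \le \|q^*_{di}(\tau; z_i(t_k)) - q_i^c\| + \|\hat{q}_j(\tau; z_j(t_k)) - q_j^c\| \le 2R.
$$

Similarly, using $q_1^c + q_2^c + q_3^c = 3q_d$, we have

$$
\begin{aligned}
& \|q^*_{di}(\tau; z_i(t_k)) + \hat{q}_j(\tau; z_j(t_k)) + \hat{q}_l(\tau; z_l(t_k)) - 3q_d\| \\
&\le \|q^*_{di}(\tau; z_i(t_k)) - q_i^c\| + \|\hat{q}_j(\tau; z_j(t_k)) - q_j^c\| + \|\hat{q}_l(\tau; z_l(t_k)) - q_l^c\| \le 3R.
\end{aligned}
$$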

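Note that $\sum_{i=1}^{N_a} \sum_{j \in \mathcal{N}_i} 1/2 = M$, where $M$ is the number of relative vectors used to define the formation objective in Section 2. Substitution in the integral equations yields the upper bound

$$
\begin{aligned}
& \gamma \int_{t_k}^{t_k+T} \Big[ \sum_{i=1}^{N_a} \sum_{j \in \mathcal{N}_i} \frac{\omega}{2} \Big\{ 2R\kappa\delta^2(\tau - t_k)^2 + \tfrac{1}{4}\kappa^2\delta^4(\tau - t_k)^4 \Big\} \\
&\qquad\qquad + \sum_{(i,j,l) \in \mathcal{E}_c} \frac{\omega}{27} \Big\{ 6R\kappa\delta^2(\tau - t_k)^2 + \kappa^2\delta^4(\tau - t_k)^4 \Big\} \Big]\, d\tau \\
&= \gamma\omega \int_{t_k}^{t_k+T} (6M + 2) R \kappa \delta^2 (\tau - t_k)^2 / 3 + (5M/4 + 5/9) \kappa^2 \delta^4 (\tau - t_k)^4 / 5\, d\tau \\
&\le \delta^2 \gamma\omega\kappa T^3 (2M + 2/3) \big[ 3R + \kappa\delta^2 T^2 \big].
\end{aligned}
$$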


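For the last term in the brackets to be a constant independent of $\delta$, we can bound the expression by setting $\delta = T$ inside the brackets, since $\delta \in (0, T]$. Finally, we have

$$
\begin{aligned}
J_\Sigma^*(z(t_{k+1}), T) - J_\Sigma^*(z(t_k), T)
&\le -\int_{t_k}^{t_k+\delta} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k)), u^*_{di}(\tau; z_i(t_k))\big)\, d\tau + \delta^2 \xi \\
&\le -\int_{t_k}^{t_k+\delta} \sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k))\big)\, d\tau + \delta^2 \xi,
\end{aligned}
$$

where

$$
\xi = \gamma\omega\kappa T^3 (2M + 2/3) \big[ 3R + \kappa T^4 \big]. \qquad (12)
$$

This completes the proof. ■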
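As a quick numerical check of the last bounding step, the sketch below evaluates the constant $\xi$ from equation (12) and compares $\delta^2 \xi$ with the closed-form value of the integral bound, obtained by integrating the two polynomial terms over $[t_k, t_k+T]$. The function names and the parameter values in the usage note are illustrative assumptions.

```python
def xi_constant(gamma, omega, kappa, T, M, R):
    """Constant xi from equation (12)."""
    return gamma * omega * kappa * T ** 3 * (2.0 * M + 2.0 / 3.0) * (3.0 * R + kappa * T ** 4)

def integrated_bound(gamma, omega, kappa, T, M, R, delta):
    """Closed-form integral over [t_k, t_k+T] of
    gamma*omega*[(6M+2) R kappa delta^2 s^2 / 3 + (5M/4+5/9) kappa^2 delta^4 s^4 / 5]."""
    first = (6.0 * M + 2.0) * R * kappa * delta ** 2 * T ** 3 / 9.0
    second = (5.0 * M / 4.0 + 5.0 / 9.0) * kappa ** 2 * delta ** 4 * T ** 5 / 25.0
    return gamma * omega * (first + second)
```

With `gamma = 2`, `omega = kappa = T = R = 1` and `M = 3`, `xi_constant` returns `160/3`, while the integrated bound at `delta = T` is roughly 4.8, which suggests that $\xi$ is a conservative constant in this setting.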

Ultimately, we want to show that $J_\Sigma^*(\cdot, T)$ decreases from one update to the next along the actual closed-loop trajectories. The next two lemmas show that, for sufficiently small $\delta$, the bounding expression above can be bounded by a negative-definite function of the closed-loop trajectories.

Lemma 4. Under Assumptions 1, 3 and 4, for any $z(t_k) \in \mathcal{Z}_\Sigma$, $k \in \mathbb{N}$, such that at least one agent $i$ satisfies $z_i(t_k) \ne z_i^c$, and for any positive constant $\xi$, there exists a $\delta(z(t_k)) > 0$ such that

$$
\int_{t_k}^{t_k+\delta} \sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k))\big)\, d\tau - \delta^2 \xi \ge \int_{t_k}^{t_k+\delta} \|z^*_{dist}(\tau; z(t_k)) - z^c\|^2_Q\, d\tau,
$$

for any $\delta \in (0, \delta(z(t_k))]$. If $z(t_k) = z^c$, then the equation above holds with $\delta(z(t_k)) = 0$.
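Proof. At time $\tau = t_k$, $z^*_{di}(t_k; z_i(t_k)) = z_i(t_k)$ and $\hat{z}_{-i}(t_k; z_{-i}(t_k)) = z_{-i}(t_k)$, and so

$$
\sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(t_k; z_i(t_k)), \hat{z}_{-i}(t_k; z_{-i}(t_k))\big) = \sum_{i=1}^{N_a} L_i^z\big(z_i(t_k), z_{-i}(t_k)\big) = \gamma \|z(t_k) - z^c\|^2_Q,
$$

where the last equality is from Definition 5. Since $Q > 0$, $\gamma > 1$, and at least one agent $i$ satisfies $z_i(t_k) \ne z_i^c$, we have that $z(t_k) \ne z^c$ and

$$
\sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(t_k; z_i(t_k)), \hat{z}_{-i}(t_k; z_{-i}(t_k))\big) > \|z(t_k) - z^c\|^2_Q > 0.
$$

Equivalently, we have

$$
\sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(t_k; z_i(t_k)), \hat{z}_{-i}(t_k; z_{-i}(t_k))\big) > \|z^*_{dist}(t_k; z(t_k)) - z^c\|^2_Q. \qquad (13)
$$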

2=1 

Under the assumptions, $z^*_{di}(\tau; z_i(t_k))$ and $\hat{z}_{-i}(\tau; z_{-i}(t_k))$ are absolutely continuous in $\tau$, for any $i = 1, \dots, N_a$. Any quadratic function of $z^*_{di}$ and $\hat{z}_{-i}$, including each $L_i^z$, is therefore absolutely continuous in $\tau$. Thus, for any given $\xi > 0$, we can choose a $\delta(z(t_k)) > 0$ such that for any $\tau \in [t_k, t_k + \delta(z(t_k)))$,

$$
\sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k))\big) > 2(\tau - t_k)\xi + \|z^*_{dist}(\tau; z(t_k)) - z^c\|^2_Q,
$$

and equality holds when $\tau = t_k + \delta(z(t_k))$. That is, $t_k + \delta(z(t_k))$ is the first time at which the two sides of the inequality are equal. Choosing such a $\delta(z(t_k)) > 0$ is possible even if each $L_i^z$ is a decreasing function of $\tau$ and $\|z^*_{dist}(\tau; z(t_k)) - z^c\|^2_Q$ is an increasing function of $\tau$, because of the initial margin in equation (13) and also because the functions have bounded rate. The latter statement follows from the functions being quadratic, therefore the gradients are linear, and from the bounds on the state and control trajectories, as stated in Remark 10. By integrating both sides in $\tau$ over the interval $[t_k, t_k + \delta(z(t_k))]$, the inequality still clearly holds. However, a larger value of $\delta(z(t_k))$ can be obtained, since we are interested in comparing the integral equations in the first place, rather than comparing the expressions over all values of $\tau \in [t_k, t_k + \delta(z(t_k))]$. Again, even assuming in the worst case that each $L_i^z$ is a decreasing function of $\tau$, while assuming $\|z^*_{dist}(\tau; z(t_k)) - z^c\|^2_Q$ is an increasing function of $\tau$, we still have that for any given $\xi > 0$, there exists a $\delta(z(t_k)) > 0$ such that


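$$
\begin{aligned}
& \int_{t_k}^{t_k+\delta(z(t_k))} \sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k))\big)\, d\tau \\
&= \int_{t_k}^{t_k+\delta(z(t_k))} 2(\tau - t_k)\xi + \|z^*_{dist}(\tau; z(t_k)) - z^c\|^2_Q\, d\tau \\
&= \delta(z(t_k))^2 \xi + \int_{t_k}^{t_k+\delta(z(t_k))} \|z^*_{dist}(\tau; z(t_k)) - z^c\|^2_Q\, d\tau. \qquad (14)
\end{aligned}
$$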


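Finally, we have

$$
\int_{t_k}^{t_k+\delta} \sum_{i=1}^{N_a} L_i^z\big(z^*_{di}(\tau; z_i(t_k)), \hat{z}_{-i}(\tau; z_{-i}(t_k))\big)\, d\tau \ge \delta^2 \xi + \int_{t_k}^{t_k+\delta} \|z^*_{dist}(\tau; z(t_k)) - z^c\|^2_Q\, d\tau,
$$

for any $\delta > 0$ no bigger than $\delta(z(t_k))$, i.e., for any $\delta \in (0, \delta(z(t_k))]$. If $z(t_k) = z^c$, then both integrands are identically zero (see proof of Proposition 2). As a result, we immediately have $\delta(z(t_k)) = 0$. ■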


Remark 11. For a given value of $\gamma > 1$, as $\|z(t_k) - z^c\|$ decreases, so does the margin in equation (13). Consequently, as the states approach the control objective, the value of $\delta(z(t_k))$ that satisfies the integral equality in equation (14) decreases. In fact, as $z(t_k) \to z^c$, the equation requires that $\delta(z(t_k)) \to 0$. This corresponds to an increasingly strict constraint on the deviation of the optimized control from the assumed control as $\|z(t_k) - z^c\|$ decreases. It also requires that communication of control trajectories must happen at an increasing rate, and with infinite bandwidth in the limit. Later, we will construct an update time that is sufficiently small to guarantee that all agents have reached their terminal constraint sets via the distributed receding horizon control, making it safe to henceforth apply the decoupled linear feedbacks (dual-mode). The sufficiently small value for the update period then serves as an upper bound on how small the value of $\delta$ must be for dual-mode distributed receding horizon control.
The lemma above provides a test on the update period that is later used to guarantee stability of the distributed receding horizon control. It would be more useful to have an analytic expression for the test. Such an expression is difficult to obtain, since the trajectories in the integrals in equation (14) are implicitly defined and therefore hard to analyze. However, by making an assumption that approximates and simplifies the functions in the integrals, we are able to obtain an analytic bound on the update period.

Assumption 5. The interval of integration $[t_k, t_k + \delta]$ for the expressions in equation (14)
is sufficiently small that first-order Taylor series approximations of the integrands are valid
for any $z(t_k) \in Z_\Sigma$. Specifically, we take
\[
\sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat z_{-i}(\tau; z_{-i}(t_k))\big)
\approx \gamma\|z(t_k) - z^c\|_Q^2 + (\tau - t_k) \sum_{i=1}^{N_a} \Big\{ \nabla_{z_i} L_i(z_i(t_k), z_{-i}(t_k))^T \big(A_i z_i(t_k) + B_i u^*_{di}(t_k; z_i(t_k))\big)
+ \sum_{j \in \mathcal{N}_i} \nabla_{z_j} L_i(z_i(t_k), z_{-i}(t_k))^T \big(A_j z_j(t_k) + B_j \hat u_j(t_k; z_j(t_k))\big) \Big\}
\]
and
\[
\|z^*_{\mathrm{dist}}(\tau; z(t_k)) - z^c\|_Q^2 \approx \|z(t_k) - z^c\|_Q^2 + (\tau - t_k)\, 2 (z(t_k) - z^c)^T Q \big(A z(t_k) + B u^*_{\mathrm{dist}}(t_k; z(t_k))\big),
\]
and ignore terms that are $O((\tau - t_k)^2)$ and higher-order.
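
To see what these expansions provide, they can be integrated over $[t_k, t_k + \delta]$ term by term. In the sketch below, $a$ and $b$ are shorthand introduced here only for illustration (they are not symbols of the paper): $a$ stands for the first-order coefficient of the expansion of $\sum_i L_i$, and $b$ for the first-order coefficient $2(z(t_k) - z^c)^T Q (A z(t_k) + B u^*_{\mathrm{dist}}(t_k; z(t_k)))$ of the second expansion.

```latex
\begin{align*}
\int_{t_k}^{t_k+\delta} \Big[ \gamma\|z(t_k)-z^c\|_Q^2 + (\tau-t_k)\,a \Big]\, d\tau
  &= \delta\gamma\|z(t_k)-z^c\|_Q^2 + \tfrac{\delta^2}{2}\,a, \\
\int_{t_k}^{t_k+\delta} \Big[ \|z(t_k)-z^c\|_Q^2 + (\tau-t_k)\,b \Big]\, d\tau
  &= \delta\|z(t_k)-z^c\|_Q^2 + \tfrac{\delta^2}{2}\,b .
\end{align*}
```

Under the assumption, both integrals are polynomials in $\delta$, and their difference is $\delta(\gamma-1)\|z(t_k)-z^c\|_Q^2 + \tfrac{\delta^2}{2}(a-b)$, which is what makes a closed-form bound on $\delta$ possible.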

Lemma 5. Under Assumptions 1 and 3-5, for any $z(t_k) \in Z_\Sigma$, $k \in \mathbb{N}$, the margin in Lemma
4 is attained with
\[
\delta(z(t_k)) = \frac{(\gamma - 1)\|z(t_k) - z^c\|_Q^2}{\xi + (\gamma - 1)\lambda_{\max}(Q)(R^2 + U_{\max}^2)},
\]
given the state and control bounds $R$ and $U_{\max}$, respectively.

Proof. From equation (14), for a given $z(t_k)$, we want to choose a $\delta$ such that
\[
\int_{t_k}^{t_k+\delta} \Big[ \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat z_{-i}(\tau; z_{-i}(t_k))\big) - \|z^*_{\mathrm{dist}}(\tau; z(t_k)) - z^c\|_Q^2 \Big]\, d\tau - \delta^2 \xi \ge 0,
\]
with equality when $\delta = \delta(z(t_k))$. Substitution of the Taylor series expression from Assumption 5
into the integrals results in
\[
\delta(\xi - C) \le (\gamma - 1)\|z(t_k) - z^c\|_Q^2, \tag{15}
\]
where $C$ combines the first-order terms in the expansions, given as


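\[
C = \frac{1}{2}\sum_{i=1}^{N_a} \Big\{ \nabla_{q_i} L_i(z_i(t_k), z_{-i}(t_k))^T \dot q_i(t_k) + \nabla_{\dot q_i} L_i(z_i(t_k), z_{-i}(t_k))^T u^*_{di}(t_k; z_i(t_k))
+ \sum_{j \in \mathcal{N}_i} \nabla_{q_j} L_i(z_i(t_k), z_{-i}(t_k))^T \dot q_j(t_k) \Big\}
- \Big[ \omega (q(t_k) - q^c)^T G^T G\, \dot q(t_k) + \nu\, \dot q(t_k)^T u^*_{\mathrm{dist}}(t_k; z(t_k)) \Big].
\]
Note that the dependence on $\hat u_j$ disappears since each $L_i$ is not a function of $\dot q_j$ for any
neighbor $j \in \mathcal{N}_i$. Since all $\nabla L_i$ inner-product terms are consistent with the state $z(t_k)$, we
can combine the sum to be
\[
\frac{1}{2}\sum_{i=1}^{N_a} \Big\{ \nabla_{q_i} L_i(z_i(t_k), z_{-i}(t_k))^T \dot q_i(t_k) + \sum_{j \in \mathcal{N}_i} \nabla_{q_j} L_i(z_i(t_k), z_{-i}(t_k))^T \dot q_j(t_k) \Big\}
= \gamma\omega (q(t_k) - q^c)^T G^T G\, \dot q(t_k).
\]
Consequently, we can simplify $C$ as
\[
C = (\gamma - 1)\Big[ \omega (q(t_k) - q^c)^T G^T G\, \dot q(t_k) + \nu\, \dot q(t_k)^T u^*_{\mathrm{dist}}(t_k; z(t_k)) \Big]
= (\gamma - 1)(z(t_k) - z^c)^T Q\, \dot z^*_{\mathrm{dist}}(t_k; z(t_k)).
\]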

Since $C$ is the inner product of (in general) different vectors, it could be positive or negative.
In equation (15), $(\gamma - 1) > 0$ and $\xi > 0$ are given, and we are looking for the largest
$\delta$ such that the inequality holds. Note that if $C \ge \xi$, the inequality holds for any positive $\delta$.
In the worst case, the constant $C$ will be a negative number, removing more of the margin
for $\delta$. More precisely, we first observe that for any two vectors $x$ and $y$,
\[
x^T y \ge -\max\{\|x\|^2, \|y\|^2\}.
\]
From this, and using the bound assumptions in equation (11), namely
\[
\|z(t_k) - z^c\|^2 \le R^2 \quad \text{and} \quad \|\dot z^*_{\mathrm{dist}}(t_k; z(t_k))\|^2 \le R^2 + U_{\max}^2,
\]
we have
\[
\big(Q^{1/2}(z(t_k) - z^c)\big)^T \big(Q^{1/2} \dot z^*_{\mathrm{dist}}(t_k; z(t_k))\big) \ge -\lambda_{\max}(Q)\,(R^2 + U_{\max}^2).
\]
So, in the worst case, we have that $\delta$ must satisfy
\[
\delta\big[\xi + (\gamma - 1)\lambda_{\max}(Q)(R^2 + U_{\max}^2)\big] \le (\gamma - 1)\|z(t_k) - z^c\|_Q^2,
\]
or, equivalently,
\[
\delta \le \frac{(\gamma - 1)\|z(t_k) - z^c\|_Q^2}{\xi + (\gamma - 1)\lambda_{\max}(Q)(R^2 + U_{\max}^2)}.
\]
The inequality above becomes an equality when $\delta = \delta(z(t_k))$, giving the stated result. ■


Consistent with the observation in Remark 11, we see that $\delta(z(t_k)) \to 0$ as $z(t_k) \to z^c$.
For an analytic test on the update period $\delta$, we can combine the equation for $\delta(z(t_k))$ in
Lemma 5 and equation (12) for $\xi$, to obtain
\[
\delta(z(t_k)) = \frac{(\gamma - 1)\|z(t_k) - z^c\|_Q^2}{\xi + \gamma\lambda_{\max}(Q)(R^2 + U_{\max}^2)}. \tag{16}
\]
Note that the equation for $\delta(z(t_k))$ in Lemma 5 is made slightly more conservative here by
replacing $(\gamma - 1)$ with $\gamma$ in the denominator term. This is done only to facilitate the analysis
that follows.
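
The bound in equation (16) is cheap to evaluate once the centralized quantities are known. The following is a minimal sketch, assuming equation (16) is read as $(\gamma-1)\|z(t_k)-z^c\|_Q^2 / (\xi + \gamma\lambda_{\max}(Q)(R^2+U_{\max}^2))$ and treating $\xi$ as a given constant. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def update_period_bound(z, z_c, Q, gamma, xi, R, U_max):
    """Evaluate the assumed form of equation (16) for one state z(t_k)."""
    e = np.asarray(z, dtype=float) - np.asarray(z_c, dtype=float)
    weighted_sq_norm = float(e @ Q @ e)            # ||z - z^c||_Q^2
    lam_max = float(np.linalg.eigvalsh(Q)[-1])     # largest eigenvalue of symmetric Q
    return (gamma - 1.0) * weighted_sq_norm / (xi + gamma * lam_max * (R**2 + U_max**2))

# Illustrative data only (not taken from the paper).
Q = np.diag([2.0, 2.0, 1.0, 1.0])
z_c = np.zeros(4)
params = dict(Q=Q, gamma=2.0, xi=5.0, R=2.0, U_max=1.0)

far = update_period_bound([1.0, 1.0, 0.0, 0.0], z_c, **params)
near = update_period_bound([0.1, 0.1, 0.0, 0.0], z_c, **params)
at_goal = update_period_bound(z_c, z_c, **params)
print(far, near, at_goal)
```

In this example the admissible update period shrinks quadratically with the distance to $z^c$ and vanishes at $z^c$, which is the behavior noted in Remark 11.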


Remark 12. The upper bound on the update period in equation (16) has some interesting
features. The bound is relatively independent of $\gamma$, particularly if $\gamma \gg 1$. The larger the
allowable bounds on the state and control ($R$ and $U_{\max}$), and the larger the horizon time $T$,
the smaller the required update time. With regard to the control compatibility constraint,
i.e., that the optimized control deviate from the assumed control with a norm value of at most
$\delta^2\kappa$, the update must also be faster for larger values of $\kappa$. Given the conservative nature
of the proofs, it is not wise to infer too much from the bound in equation (16), but it is
reassuring to observe such intuitive effects. Additional comments on the conservatism of the
result will be stated at the end of this section.

We also note that since $\delta(z(t_k))$ depends on $\|z(t_k) - z^c\|_Q^2$, a centralized computation is
required to generate equation (16) at each receding horizon update. Otherwise, a distributed
consensus algorithm, as given in [24] for example, must be run in parallel to determine
$\|z(t_k) - z^c\|_Q^2$, or a suitable lower bound on it. In the dual-mode version defined below, no
such centralized computation is required on-line, since a fixed bound on the update period
is computed off-line and applied for every receding horizon update.

The  first  main  theorem  of  the  paper  is  now  given. 

Theorem 2. Under Assumptions 1 and 3-5, for a given fixed horizon time $T > 0$ and for
any state $z(t_{-1}) \in Z_\Sigma$ at initialization, if the update time $\delta$ satisfies $\delta \in (0, \delta(z(t_k))]$, $k \in \mathbb{N}$,
where $\delta(z(t_k))$ is defined in equation (16), then $z^c$ is an asymptotically stable equilibrium
point of the closed-loop system (8) with region of attraction $Z_\Sigma$, an open and connected set.

Proof. If $z(t_k) = z^c$ and $\hat u_i(\tau; z_i(t_k)) = 0$ for all $\tau \in [t_k, t_k + T]$ and every $i = 1, \ldots, N_a$, then the
optimal solution to Problem 2 is $u^*_{\mathrm{dist}}(\tau; z(t_k)) = 0$ for all $\tau \in [t_k, t_k + T]$. This is shown
in the proof of Proposition 2. Since $A z^c = 0$, $z^c$ is an equilibrium point of the closed-loop
system (8).

We observe that $J^*_\Sigma(z(t_k), T)$ has the following properties:

• $J^*_\Sigma(z^c, T) = 0$ and $J^*_\Sigma(z(t_k), T) > 0$ for $z(t_k) \ne z^c$,

• $J^*_\Sigma(z(t_k), T)$ is continuous at $z(t_k) = z^c$,

• along the trajectory of the closed-loop system from $z(t_0)$, where $z(t_0) = z^*_{\mathrm{dist}}(t_0; z(t_{-1}))$
for any initial state $z(t_{-1}) \in Z_\Sigma$,
\[
J^*_\Sigma(z(t_m), T) - J^*_\Sigma(z(t_k), T) \le -\int_{t_k}^{t_m} \|z^*_{\mathrm{dist}}(\tau) - z^c\|_Q^2 \, d\tau,
\]
for integers $k, m$ with $0 \le k \le m \le \infty$.

The first two properties follow from Proposition 2. The last property is derived as follows.
Combining Lemma 4 and Lemma 3, we have
\[
J^*_\Sigma(z(t_{k+1}), T) - J^*_\Sigma(z(t_k), T) \le -\int_{t_k}^{t_{k+1}} \|z^*_{\mathrm{dist}}(\tau; z(t_k)) - z^c\|_Q^2 \, d\tau.
\]
Applying this recursively gives the result for any $t_m \ge t_k$. Following precisely the steps in
the proof of Theorem 1 in [6] from this point gives the stated result. ■

From equation (16), we observe that $\delta(z(t_k)) \to 0$ as $z(t_k) \to z^c$. As a consequence,
the control comparison constraint gets tighter, and the communication between neighboring
agents must happen with increasing bandwidth, as the agents approach their control objective.
To mitigate these problems, we now propose a dual-mode version of the distributed
receding horizon control law. The closed-loop system will be equation (8) until all agents are
in the interior of their terminal constraint sets, at which point each control is synchronously
redefined to be the decoupled linear feedback defined in Assumption 3.

To construct the dual-mode version, we will make use of the monotonicity of $J^*_\Sigma(z(t_k), T)$
for guaranteeing invariance properties of the distributed receding horizon control law. In
particular, under the conditions of Theorem 2, $J^*_\Sigma(z(t_k), T)$ monotonically decreases until
$z(t_k) = z^c$. Therefore, the concatenated state $z(t_k)$ is contracting, in some norm sense, at
each receding horizon update. The control switch must therefore rely on a test on the entire
state $z(t_k)$ to guarantee all agents are in their terminal constraint sets. A sufficient test is
whether
\[
z(t_k) \in \Omega(\varepsilon_{\min}) := \big\{ z \in \mathbb{R}^{2nN_a} : \|z - z^c\|_P^2 \le \varepsilon_{\min} \big\}, \qquad \varepsilon_{\min} = \min_i \varepsilon_i. \tag{17}
\]
If this holds, then, since
\[
\|z(t_k) - z^c\|_P^2 = \sum_{i=1}^{N_a} \|z_i(t_k) - z_i^c\|_{P_i}^2 \le \varepsilon_{\min},
\]
we have
\[
\|z_i(t_k) - z_i^c\|_{P_i}^2 \le \varepsilon_{\min}, \quad \forall\, i = 1, \ldots, N_a,
\]
guaranteeing all agents are in their terminal constraint sets. Under stated assumptions, we
will show that $\|z(t_k) - z^c\|_W^2$ is contracting with each update, where the positive-definite,
symmetric weighting $W$ will be defined more precisely below. Since the contraction is happening
with a different norm weighting than $P$, we require a sufficient test on the $W$-weighted
quadratic term to guarantee that equation (17) holds. Recall that $\lambda_{\max}(Q_0) \in \mathbb{R}$ is the maximum
eigenvalue of any symmetric matrix $Q_0$, and let $\lambda_{\min}(Q_0) \in \mathbb{R}$ denote the minimum
eigenvalue of $Q_0$. Observe that if


\[
\|z(t_k) - z^c\|_W^2 \le \frac{\lambda_{\min}(W)\,\varepsilon_{\min}}{\lambda_{\max}(P)}, \tag{18}
\]
then we have
\[
\frac{\lambda_{\min}(W)}{\lambda_{\max}(P)} \|z(t_k) - z^c\|_P^2 \le \|z(t_k) - z^c\|_W^2 \quad \Longrightarrow \quad \|z(t_k) - z^c\|_P^2 \le \varepsilon_{\min}.
\]
The test on the $W$-weighted quadratic term is more conservative than testing for equation
(17) directly. However, since the $W$-weighted term is shown to strictly decrease for the
closed-loop system, we are guaranteed invariance, i.e., once equation (18) is true, it remains
true for all future time. Positive invariance is required as it will take some time for the
agents to agree, in a distributed way, that they are in the set $\Omega(\varepsilon_{\min})$.
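
The implication from equation (18) to equation (17) rests only on eigenvalue bounds, so it can be checked numerically. The sketch below draws random symmetric positive-definite matrices as stand-ins for $W$ and $P$ (the paper does not give numerical values for them), places test vectors exactly on the boundary of (18), and records the largest observed ratio $\|e\|_P^2/\varepsilon_{\min}$. The helper names are hypothetical.

```python
import numpy as np

def w_threshold(W, P, eps_min):
    """Right-hand side of (18): lambda_min(W) * eps_min / lambda_max(P)."""
    lam_min_W = float(np.linalg.eigvalsh(W)[0])
    lam_max_P = float(np.linalg.eigvalsh(P)[-1])
    return lam_min_W * eps_min / lam_max_P

def random_spd(rng, n):
    """Random symmetric positive-definite matrix (illustrative data only)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
n, eps_min = 6, 0.5
W, P = random_spd(rng, n), random_spd(rng, n)

worst_ratio = 0.0
for _ in range(1000):
    e = rng.standard_normal(n)                                 # plays the role of z(t_k) - z^c
    e *= np.sqrt(w_threshold(W, P, eps_min) / (e @ W @ e))     # put e on the boundary of (18)
    worst_ratio = max(worst_ratio, float(e @ P @ e) / eps_min)
print(worst_ratio)
```

A ratio at or below one for every sample is what the eigenvalue argument predicts; how far below one it stays gives a feel for the conservatism of the $W$-weighted test.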

It is required that $\delta$ be no larger than a value, denoted $\delta_{\max}$, that guarantees monotonic
decrease of the value function $J^*_\Sigma(z(t_k), T)$ so that equation (18) holds after some finite time.
Now, we assume some quadratic bounds on the value function and show convergence of a
sequence such that equation (18), and hence equation (17), is guaranteed to hold after some
finite number of iterations, from any feasible state at initialization.

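Assumption 6. For any $z \in Z_\Sigma$, there exist positive constants $k_1, k_2 \in \mathbb{R}$, $k_2 \ge k_1$, and a
positive-definite, symmetric matrix $W \in \mathbb{R}^{2nN_a \times 2nN_a}$ such that
\[
k_1 \|z - z^c\|_W^2 \le J^*_\Sigma(z, T) \le k_2 \|z - z^c\|_W^2,
\]
where

• $k_2 > \delta_{\max}\gamma\,\lambda_{\min}(Q)/\lambda_{\max}(W)$ and

• $k_2 - k_1 \le \delta_{\max}\gamma\,\lambda_{\min}(Q)/(2\lambda_{\max}(W))$,

and $\delta_{\max}$ is defined as
\[
\delta_{\max} = \frac{(\gamma - 1)\,c}{\xi + \gamma\lambda_{\max}(Q)(R^2 + U_{\max}^2)}, \quad \text{where} \quad c = \left(\frac{\lambda_{\min}(W)}{\lambda_{\max}(W)}\right) \frac{\lambda_{\min}(Q)\,\varepsilon_{\min}}{4\lambda_{\max}(P)}.
\]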
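
Since $\delta_{\max}$ is the quantity computed off-line in the dual-mode scheme, a short numerical sketch may help. It assumes the reading $\delta_{\max} = (\gamma-1)c/(\xi + \gamma\lambda_{\max}(Q)(R^2+U_{\max}^2))$ with $c = (\lambda_{\min}(W)/\lambda_{\max}(W))\,\lambda_{\min}(Q)\,\varepsilon_{\min}/(4\lambda_{\max}(P))$, and takes the eigenvalue ratio of $W$ (or a lower bound on it) as an input named `w_ratio`. The function name and the numbers are hypothetical.

```python
def delta_max(gamma, xi, lam_min_Q, lam_max_Q, lam_max_P, eps_min, w_ratio, R, U_max):
    """Return (delta_max, c) for the assumed form of the Assumption 6 definitions.

    w_ratio stands for lambda_min(W) / lambda_max(W), or any lower bound on it.
    """
    c = w_ratio * lam_min_Q * eps_min / (4.0 * lam_max_P)
    bound = (gamma - 1.0) * c / (xi + gamma * lam_max_Q * (R**2 + U_max**2))
    return bound, c

# Illustrative numbers only (not taken from the paper).
common = dict(gamma=2.0, xi=5.0, lam_min_Q=1.0, lam_max_Q=2.0,
              lam_max_P=4.0, eps_min=0.8, R=2.0, U_max=1.0)
d_exact, c_exact = delta_max(w_ratio=0.5, **common)
d_lower, c_lower = delta_max(w_ratio=0.25, **common)   # using a cruder lower bound on the ratio
print(d_exact, d_lower)
```

Replacing the ratio by a smaller lower bound shrinks $c$ and therefore $\delta_{\max}$ proportionally, which is the additional conservatism incurred when only a bound on the ratio is available.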

The requirement that $k_2 > \delta_{\max}\gamma\,\lambda_{\min}(Q)/\lambda_{\max}(W)$ is not too restrictive, since the right-hand
side is a small number in general. The requirement that $k_2 - k_1 \le \delta_{\max}\gamma\,\lambda_{\min}(Q)/(2\lambda_{\max}(W))$
means $k_2$ is close to $k_1$. Equivalently, this means that $J^*_\Sigma(z, T)$ basically exhibits quadratic
growth, with a $W$-weighted norm. In the analysis that follows, it is not required to know the
values of $k_2$ and $k_1$, so they do not need to be estimated. Since $\delta_{\max}$ is to be computed and
used in practice, it will be required to estimate the ratio $\lambda_{\min}(W)/\lambda_{\max}(W)$. More simply,
if a reasonable lower bound on this ratio can be computed, the bound can be used to define
$c$, although this results in more conservatism. Aside from computing the ratio, or a lower
bound on it, the weighting matrix $W$ does not need to be computed.

Lemma 6. Under Assumptions 1 and 3-6, for a given fixed horizon time $T > 0$ and for
any state $z(t_{-1}) \in Z_\Sigma$ at initialization, if the update period $\delta = \delta_{\max}$, then for the closed-loop
system (8) there exists a finite integer $l \in \mathbb{N}$ at which $z(t_l) \in \Omega(\varepsilon_{\min})$ and $z(t_k) \in \Omega(\varepsilon_{\min})$
for all $k \ge l$.

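Proof. From Lemma 3, the monotonicity statement on $J^*_\Sigma(z(t_k), T)$, for all $k \in \mathbb{N}$, is
\[
J^*_\Sigma(z(t_{k+1}), T) - J^*_\Sigma(z(t_k), T) \le -\int_{t_k}^{t_k+\delta} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat z_{-i}(\tau; z_{-i}(t_k))\big)\, d\tau + \delta^2 \xi.
\]
From the Taylor series approximation in Assumption 5, we have
\[
\int_{t_k}^{t_k+\delta} \sum_{i=1}^{N_a} L_i\big(z^*_{di}(\tau; z_i(t_k)), \hat z_{-i}(\tau; z_{-i}(t_k))\big)\, d\tau
\approx \delta\gamma \|z(t_k) - z^c\|_Q^2 + \delta^2 \gamma (z(t_k) - z^c)^T Q\, \dot z^*_{\mathrm{dist}}(t_k; z(t_k)).
\]
From the proof of Lemma 5,
\[
-(z(t_k) - z^c)^T Q\, \dot z^*_{\mathrm{dist}}(t_k; z(t_k)) \le \lambda_{\max}(Q)\,(R^2 + U_{\max}^2).
\]
Now, substituting $\delta = \delta_{\max}$, we can rewrite the equation bounding $J^*_\Sigma(z(t_k), T)$ as
\[
J^*_\Sigma(z(t_{k+1}), T) - J^*_\Sigma(z(t_k), T) \le -\delta_{\max}\gamma \|z(t_k) - z^c\|_Q^2 + \delta_{\max}^2 \big(\xi + \gamma\lambda_{\max}(Q)(R^2 + U_{\max}^2)\big)
\le -\delta_{\max}\gamma \frac{\lambda_{\min}(Q)}{\lambda_{\max}(W)} \|z(t_k) - z^c\|_W^2 + \delta_{\max}(\gamma - 1)\,c.
\]
Using the bounds on $J^*_\Sigma(z(t_k), T)$ from Assumption 6 we have
\[
k_1 \|z(t_{k+1}) - z^c\|_W^2 \le k_2 \|z(t_k) - z^c\|_W^2 - \delta_{\max}\gamma \frac{\lambda_{\min}(Q)}{\lambda_{\max}(W)} \|z(t_k) - z^c\|_W^2 + \delta_{\max}(\gamma - 1)\,c.
\]
Denoting $y_k = \|z(t_k) - z^c\|_W^2 \in [0, \infty)$, we rewrite this as
\[
y_{k+1} \le \rho\, y_k + \phi, \quad \text{where} \quad \rho = \frac{k_2 - \delta_{\max}\gamma\,\lambda_{\min}(Q)/\lambda_{\max}(W)}{k_1}, \quad \phi = \frac{\delta_{\max}(\gamma - 1)\,c}{k_1}.
\]
Also from Assumption 6, $k_2 > \delta_{\max}\gamma\,\lambda_{\min}(Q)/\lambda_{\max}(W)$ and
\[
k_2 - k_1 \le \delta_{\max}\gamma\,\lambda_{\min}(Q)/(2\lambda_{\max}(W)) < \delta_{\max}\gamma\,\lambda_{\min}(Q)/\lambda_{\max}(W),
\]
which implies $0 < \rho < 1$. Considering the sequence $y_{k+1}$, which is bounded for each $k \in \mathbb{N}$
by $\rho\, y_k + \phi$, we observe that
\[
y_\infty = \lim_{k \to \infty} y_{k+1} \le \lim_{k \to \infty} \Big[ \rho^k y_0 + \phi \sum_{i=0}^{k-1} \rho^i \Big] = \frac{\phi}{1 - \rho}.
\]
Also, we can bound the ratio $\phi/(1 - \rho)$ as
\[
\frac{\phi}{1 - \rho} = \frac{\delta_{\max}(\gamma - 1)\,c}{k_1 + \delta_{\max}\gamma\,\lambda_{\min}(Q)/\lambda_{\max}(W) - k_2}
\le \frac{2\delta_{\max}\lambda_{\max}(W)(\gamma - 1)\,c}{\delta_{\max}\gamma\,\lambda_{\min}(Q)}
\le \frac{2\lambda_{\max}(W)\,c}{\lambda_{\min}(Q)},
\]
where the first inequality uses the assumed upper bound on $k_2 - k_1$. From the definition of
$c$ we have
\[
\frac{2\lambda_{\max}(W)\,c}{\lambda_{\min}(Q)} = \frac{\lambda_{\min}(W)\,\varepsilon_{\min}}{2\lambda_{\max}(P)}.
\]
Therefore, denoting $y_\infty = \|z(t_\infty) - z^c\|_W^2$, we have
\[
\|z(t_\infty) - z^c\|_W^2 \le \frac{\lambda_{\min}(W)\,\varepsilon_{\min}}{2\lambda_{\max}(P)} \quad \Longrightarrow \quad \|z(t_\infty) - z^c\|_P^2 \le \frac{\varepsilon_{\min}}{2},
\]
and so $z(t_\infty)$ is in the interior of $\Omega(\varepsilon_{\min})$. Moreover, for any $y_k = (\vartheta + \phi)/(1 - \rho)$, where
$\vartheta \in (0, \infty)$, the sequence bound $y_{k+1} \le \rho\, y_k + \phi$ guarantees strict monotonic decrease of
the sequence. The reason is $\rho\, y_k + \phi - y_k = -(1 - \rho) y_k + \phi = -\vartheta - \phi + \phi = -\vartheta$, and so
$y_{k+1} \le y_k - \vartheta$. In particular, once $y_k \le 2\phi/(1 - \rho)$, we are guaranteed that equation (17)
holds, since the factor of 2 simply removes the 1/2 in the implication above.

Now, there is a finite integer $l \in \mathbb{N}$ for which $y_l \le 2\phi/(1 - \rho)$. If this were not the case,
$y_{k+1} \le y_k - \vartheta$ would hold for all $k \in \mathbb{N}$, which implies $y_k \to -\infty$ as $k \to \infty$. However, this
contradicts the fact that $y_k \ge 0$ for all $k \in \mathbb{N}$. Therefore, there exists a finite integer $l \in \mathbb{N}$
for which $y_l \le 2\phi/(1 - \rho)$. Also, since the sequence is required to continue to decrease up to
at most the limit bound on $\phi/(1 - \rho)$, we have positive invariance as well. This concludes
the proof. ■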
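
The contraction argument in the proof can be illustrated by iterating the worst case of the scalar recursion $y_{k+1} \le \rho\, y_k + \phi$ with equality. The values of $\rho$, $\phi$ and $y_0$ below are arbitrary illustrative numbers (the paper does not estimate $k_1$ and $k_2$, so no actual values are available), and the function name is hypothetical.

```python
def iterate_worst_case(y0, rho, phi, steps):
    """Iterate y_{k+1} = rho * y_k + phi, the worst case allowed by the bound."""
    ys = [y0]
    for _ in range(steps):
        ys.append(rho * ys[-1] + phi)
    return ys

rho, phi = 0.9, 0.05              # any 0 < rho < 1 and phi > 0 behave the same way
limit = phi / (1.0 - rho)         # the limit bound phi / (1 - rho)
ys = iterate_worst_case(10.0, rho, phi, 200)

# First update index at which y_k <= 2 * phi / (1 - rho), the level used in the proof.
first_inside = next(k for k, y in enumerate(ys) if y <= 2.0 * limit)
print(limit, ys[-1], first_inside)
```

Starting above the limit, the sequence decreases strictly, crosses the level $2\phi/(1-\rho)$ after finitely many updates, and never returns above it, which mirrors the finite-time entry and positive invariance claimed in Lemma 6.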


Remark 13. If the update period $\delta$ is less than $\delta_{\max}$, then the analysis above still holds,
provided the bound on the difference $k_2 - k_1$ in Assumption 6 is tightened by replacing $\delta_{\max}$
by $\delta$.

For the control switch to occur synchronously between the agents, we require a distributed
means of determining when equation (18) holds. Since Lemma 6 guarantees monotonic
decrease in the $W$-norm sense, we shall cast the test in terms of the $W$-norm. Although this
incurs more conservatism, it implies that the agents will not come to agreement unless they
have all reached the state for which all subsequent receding horizon updates render $\Omega(\varepsilon_{\min})$
a positively invariant set of the closed-loop system.

Distributed Consensus Algorithm for Synchronous Control Switching [24].

The algorithm defined here is a distributed means of determining when equation (18) holds.
By distributed, we mean each agent communicates only with neighboring agents. For the
algorithm to converge, a minimum number of information exchanges is required. We elect to
have a separate sample rate for this algorithm. Over the time interval $[t_k, t_{k+1})$, we assume
neighboring agents will communicate $N_s$ times, using notation
\[
\tau_{k,l} = t_k + \delta\,(l/N_s), \quad l = 0, 1, \ldots, N_s - 1,
\]
to denote the times at which neighbors exchange information.

To define the dynamics of the switch mechanism, we introduce some notation. At any
algorithm update time $\tau_{k,l}$ and for any agent $i$, let $x_i(\tau_{k,l}) \in \mathbb{R}_+$ be a non-negative scalar
value. Also, denote $\varepsilon_0 = \varepsilon_{\min}/\lambda_{\max}(P)$ and $\psi = \lambda_{\max}(W)/\lambda_{\min}(W)$, where $W$ is the weighting
matrix defined in Assumption 6. As with the definition of $c$ in Assumption 6, a reasonable
upper bound for this ratio, or equivalently a lower bound on $\lambda_{\min}(W)/\lambda_{\max}(W)$, could be
used in the place of $\psi$ below. The algorithm is as follows. For every agent $i = 1, \ldots, N_a$:

1. At any time $\tau_{k,0} = t_k$, set
\[
x_i(\tau_{k,0}) = N_a \psi \|z_i(t_k) - z_i^c\|^2,
\]
transmit $x_i(\tau_{k,0})$ and receive $x_j(\tau_{k,0})$ from each neighbor $j \in \mathcal{N}_i$.

2. For each time $\tau_{k,l}$, $l = 1, 2, \ldots, N_s - 1$,
(a) set
\[
x_i(\tau_{k,l}) = x_i(\tau_{k,l-1}) + \frac{\zeta}{N_s} \sum_{j \in \mathcal{N}_i} \big( x_j(\tau_{k,l-1}) - x_i(\tau_{k,l-1}) \big),
\]
where $\zeta > 0$, and

(b) transmit $x_i(\tau_{k,l})$ and receive $x_j(\tau_{k,l})$ from each neighbor $j \in \mathcal{N}_i$.

3. Define $x_i(\tau_{k,N_s})$ according to the equation in 2(a) above. If
\[
x_i(\tau_{k,N_s}) \le \varepsilon_0 - \epsilon,
\]
where $\epsilon$ is a specified tolerance satisfying $0 < \epsilon \ll \varepsilon_0$, then switch at time $t_{k+1}$ and exit
the algorithm. Otherwise, return to 1.

End of algorithm.
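
The steps above can be sketched in a few lines for a small network. This is a minimal simulation, assuming the step 2(a) update is the standard neighbor-averaging rule with gain $\zeta/N_s$ and that all agents update simultaneously from the values received at the previous exchange. The graph, the initial values, and the names `consensus_round` and `agrees_to_switch` are hypothetical.

```python
import numpy as np

def consensus_round(x0, neighbors, zeta, n_s):
    """Run the N_s neighbor-averaging updates of steps 2 and 3 for all agents."""
    x = np.asarray(x0, dtype=float).copy()
    gain = zeta / n_s
    for _ in range(n_s):
        x_prev = x.copy()   # every agent uses the values received at the previous exchange
        for i, nbrs in enumerate(neighbors):
            x[i] = x_prev[i] + gain * sum(x_prev[j] - x_prev[i] for j in nbrs)
    return x

def agrees_to_switch(x_final, eps0, eps):
    """Step 3 test, evaluated separately by each agent."""
    return [bool(xi <= eps0 - eps) for xi in x_final]

# Four agents on a path graph 0 - 1 - 2 - 3 (illustrative topology and data).
neighbors = [[1], [0, 2], [1, 3], [2]]
x0 = [4.0, 0.0, 2.0, 2.0]           # stands in for N_a * psi * ||z_i(t_k) - z_i^c||^2
x_final = consensus_round(x0, neighbors, zeta=20.0, n_s=100)
votes = agrees_to_switch(x_final, eps0=2.5, eps=0.01)
print(x_final, votes)
```

Because the update is symmetric between neighbors, the average of the $x_i$ is preserved, so every agent ends up testing (approximately) the same network-wide quantity even though it only ever talks to its neighbors.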

Under the conditions stated in a lemma below, namely if $N_s$ is sufficiently large, then
\[
|x_i(\tau_{k,N_s}) - \mathrm{Ave}(x(t_k))| \le \epsilon, \quad \forall\, i = 1, \ldots, N_a,
\]
where
\[
\mathrm{Ave}(x(t_k)) = \frac{1}{N_a} \sum_{i=1}^{N_a} x_i(t_k) = \psi \sum_{i=1}^{N_a} \|z_i(t_k) - z_i^c\|^2 = \psi \|z(t_k) - z^c\|^2.
\]
From this, we have
\[
\psi \|z(t_k) - z^c\|^2 - \epsilon \le x_i(\tau_{k,N_s}) \le \psi \|z(t_k) - z^c\|^2 + \epsilon.
\]
Therefore, the test in part 3 of the algorithm guarantees that
\[
\|z(t_k) - z^c\|_W^2 \le \lambda_{\min}(W)\,\varepsilon_0
\]
and equation (18) holds.

Lemma 7. For a specified tolerance $\epsilon$, satisfying $0 < \epsilon \ll \varepsilon_0$, the distributed consensus
algorithm for synchronous control switching converges in $N_s$ iterations provided that

• $N_s > \zeta d_m$, where $d_m$ is the maximum node degree of the graph $\mathcal{G}$, and

• denoting $\Lambda = 1 - (\zeta/N_s)\lambda_2(L)$, where $L$ is the Laplacian of the graph $\mathcal{G}$ and $\lambda_2(L)$ is
the second smallest eigenvalue of $L$,
\[
N_s \ge \log_\Lambda\!\left(\frac{\epsilon}{d_0}\right), \qquad d_0 = \sum_{i=1}^{N_a} \big( x_i(\tau_{k,0}) - \mathrm{Ave}(x(t_k)) \big)^2,
\]
where $d_0$ denotes a measure of initial disagreement.

It is proven in [24] that the eigenvalue $\lambda_2(L)$ bounds the rate of convergence of the average
consensus algorithm in continuous time. Converting to discrete time, the convergence bound
becomes $\Lambda$ defined above, and the first condition in the lemma implies $0 < \Lambda < 1$. The
lemma says that if $N_s$ is sufficiently large, every agent will meet the tolerance specified in
the algorithm. Therefore, the test in step 3 of the algorithm implies the agents agree to
synchronously switch to the linear feedback controllers only if equation (18) holds, i.e., if all
agents are in the interior of their terminal constraint sets. Note that if $d_0 \le \epsilon$, consensus has
been reached from the initial values for each $x_i$, and the algorithm would terminate in one
iteration. The result of the lemma presumes that initially $d_0 > \epsilon$, as is the case in practice.
With $0 < \Lambda < 1$ and $d_0 > \epsilon$, the second condition on $N_s$ provides a lower bound that is
positive.
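
Because $\Lambda$ itself depends on $N_s$, the two conditions of Lemma 7 are implicit in $N_s$ and are easiest to check by direct search. The sketch below makes several assumptions that should be read as such: the second condition is taken to be $N_s \ge \log_\Lambda(\epsilon/d_0)$, $\lambda_2(L)$ is taken to be the algebraic connectivity (second-smallest Laplacian eigenvalue, as in the consensus literature), and $d_0$ is supplied as a number rather than computed. The function name, the search cap, and the example graph are hypothetical.

```python
import math
import numpy as np

def min_exchanges(adjacency, zeta, eps, d0, n_s_cap=100000):
    """Smallest N_s meeting both assumed Lemma 7 conditions, or None if none is found."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    laplacian = np.diag(degrees) - A
    lam2 = float(np.linalg.eigvalsh(laplacian)[1])     # second-smallest eigenvalue
    n_s = int(math.floor(zeta * degrees.max())) + 1    # smallest integer with N_s > zeta * d_m
    while n_s <= n_s_cap:
        rate = 1.0 - (zeta / n_s) * lam2               # Lambda for this candidate N_s
        if 0.0 < rate < 1.0 and n_s >= math.log(eps / d0) / math.log(rate):
            return n_s, rate
        n_s += 1
    return None

# Path graph on four nodes (illustrative).
path_adjacency = [[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]]
d0, eps = 2.0, 1e-3
n_s, rate = min_exchanges(path_adjacency, zeta=20.0, eps=eps, d0=d0)
print(n_s, rate)
```

One design observation from this sketch: since $\Lambda^{N_s}$ tends to $e^{-\zeta\lambda_2(L)}$ as $N_s$ grows, increasing $N_s$ alone cannot meet a tight tolerance when $\zeta$ is small, and the search then returns `None`.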

Remark 14. Since $N_s$ will be large in general, the communication requirements between
receding horizon updates will be demanding for the algorithm to converge. To alleviate this,
observe that from the invariance property stated in Lemma 6, we know that once equation
(18) holds, it will continue to hold. As such, it is possible to communicate once every receding
horizon update. This is done by defining $\tau_{kN_s,l} = t_{kN_s} + \delta l$, so step 1 is entered every $\delta N_s$
seconds. The tradeoff, of course, is that the algorithm will take considerably more time to
converge. A sample rate between these two extremes could be used, one that is appropriate
for the given bandwidth limitations.

We  now  define  the  dual-mode  distributed  receding  horizon  controller. 

Definition 10. (Dual-mode distributed receding horizon controller)

Data: Initial state $z(t_{-1}) \in Z_\Sigma$, horizon time $T > 0$, update time $\delta = \delta_{\max}$.

Initialization: At time $t_{-1}$, follow the procedure given in Definition 7, yielding a control
for time $t \in [t_{-1}, t_0]$.

Controller:

1. For $t \in [t_0, t_1]$, employ the distributed receding horizon control law $u^*_{\mathrm{dist}}(t; z(t_0))$.

2. At any time $t_k$, $k \in \mathbb{N}$:

(a) If the distributed consensus algorithm has converged (step 3), employ the
decoupled linear feedbacks defined in Assumption 3 for all time $t \ge t_k$. Else:

(b) Employ the distributed receding horizon control law $u^*_{\mathrm{dist}}(t; z(t_k))$ for time
$t \in [t_k, t_{k+1}]$.
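
The switching logic of Definition 10 can be sketched independently of the optimal control solver. In the sketch below, `consensus_converged` is a hypothetical callable standing in for the outcome of step 3 of the consensus algorithm at update $t_k$, and the labels "rhc" and "linear" are placeholders for the two control laws. It only records which law is active on each update interval.

```python
def dual_mode_schedule(consensus_converged, n_updates):
    """Return the control mode used on each interval [t_k, t_{k+1}], k = 0..n_updates-1."""
    modes = ["rhc"]              # step 1: receding horizon control on [t_0, t_1]
    switched = False
    for k in range(1, n_updates):
        if not switched and consensus_converged(k):
            switched = True      # step 2(a): switch once, then keep the linear feedback
        modes.append("linear" if switched else "rhc")
    return modes

# Hypothetical example: agreement is first reported at the third update.
modes = dual_mode_schedule(lambda k: k >= 3, n_updates=6)
# Even if a later check were to fail, the switch is not undone.
latched = dual_mode_schedule(lambda k: k == 2, n_updates=5)
print(modes, latched)
```

The switch is latched because step 2(a) applies the linear feedbacks for all $t \ge t_k$; the positive invariance from Lemma 6 is what justifies never switching back.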

Lemma 6 guarantees that, under the assumptions, the inequality in equation (18) will hold
after a finite number $l \in \mathbb{N}$ of receding horizon updates. Under the conditions of Lemma
7, the agents will agree to switch at time $t_{l+1}$ to the decoupled linear feedback controllers.
Since equation (18) holds, the agents are known to be in a set for which these feedbacks are
asymptotically stabilizing. Therefore, the dual-mode distributed receding horizon controller
results in asymptotic stability with region of attraction $Z_\Sigma$. Formally, we now state this as
the second main theorem of the paper.

Theorem 3. Under Assumptions 1 and 3-6 and under the conditions of Lemma 7, for a
given fixed horizon time $T > 0$ and for any state $z(t_{-1}) \in Z_\Sigma$ at initialization, if the update
period $\delta = \delta_{\max}$, then $z^c$ is an asymptotically stable equilibrium point of the closed-loop system
resulting from the dual-mode distributed receding horizon controller, and $Z_\Sigma$ is a region of
attraction.

Remark 15. The distributed receding horizon control law of Theorem 2 requires that all
agents have the following information available: the horizon time $T$, the update period $\delta$,
and all parameters in the computation for $\delta(z(t_k))$ in equation (16), which includes the
centralized computation of $\|z(t_k) - z^c\|_Q^2$ at each receding horizon update. The dual-mode
distributed receding horizon control law of Theorem 3 requires that all agents have the
following information available: the horizon time $T$, all parameters in the computation for
$\delta_{\max}$ in Assumption 6, and the parameters for the distributed consensus algorithm, satisfying
the conditions in Lemma 7. Clearly, both controllers require some centralized information;
however, the dual-mode version does not require any on-line centralized computations. We
note that for the controller of Theorem 2, another consensus algorithm could be incorporated
for distributed computation of $\|z(t_k) - z^c\|_Q^2$ at each receding horizon update.

4.3  Alternative  Formulations 

In this section, we discuss two independent modifications to the distributed receding horizon
control approach given in Section 4.1. First, we briefly explore the implications of transmitting
assumed position information, instead of assumed control information, between neighboring
agents. Next, the effects of using a position comparison constraint, rather than the
control comparison constraint, in each distributed optimal control problem are investigated.

The distributed optimal control problems here require only the assumed position trajectories
from each neighbor. As such, neighboring agents could instead exchange assumed
position trajectories, rather than assumed control trajectories. The result would be that
agents would then not have to integrate the equations of motion for each neighboring agent,
and also not require a separate transmission to obtain the initial condition for each neighboring
agent. The result is less communication between neighboring agents and simplification
in the optimal control computations. In other problems, however, it may be that the state
information of neighbors required in each local optimal control problem is more demanding.
This would be the case here, for example, if the distributed integrated costs depended also
on the assumed velocity of every neighbor. In such cases, the communication requirements
are lower by sharing initial state and assumed control trajectory information at each update.
To generalize, the tradeoff between exchanging control or state trajectory information, in
terms of the overall communication and computation requirements, should dictate how the
needed assumed information should be attained.

We now discuss some implications of replacing each control comparison constraint with
a position comparison constraint. The control comparison constraint is
\[
\|u_i(s; z_i(t_k)) - \hat u_i(s; z_i(t_k))\| \le \delta^2 \kappa, \quad s \in [t_k, t_k + T],
\]
whereas a position comparison constraint is
\[
\|q_i(s; z_i(t_k)) - \hat q_i(s; z_i(t_k))\| \le \delta^2 \kappa, \quad s \in [t_k, t_k + T].
\]
It is the proof of Lemma 3 that requires a bound on the norm of the difference between the
assumed position trajectories and the actual position trajectories. If the control comparison
constraint is replaced by the position comparison constraint, the result in the lemma still
follows by substituting the constraint bounds directly into the bounding argument. The
resulting constant $\xi$ is then redefined to be

£  =  jluk,T(4M  +  4/3)  [R  +  nT2]  . 

Now, $\xi$ grows as a lower-order function of horizon time $T$, with effectively the same growth
relation to the other parameters as before. The upper bound on the update period $\delta$, defined
as $\delta_{\max}$ or $\delta(z(t_k))$ in equation (16), is proportional to $1/(\xi + c)$, where $c$ is a constant. Thus,
the upper bound on $\delta$ is potentially larger, i.e., less conservative, for large horizon times, when
using the position comparison constraint. Therefore, for a given $\kappa$ and $\delta$ that satisfy the
theoretical bounds, the comparison constraint bound $\delta^2\kappa$ could be larger using the position
comparison constraint, when $T$ is large.

We now discuss another reason that the position comparison constraint may be favorable.
Assume that from a given initial condition $z(t_0)$, it takes $N_0$ iterations to reach the terminal
constraint set using the distributed receding horizon control law, with either control or
position comparison constraint. From the proof of Lemma 3, recall that with the control
comparison constraint
\[
\|q^*_{dj}(\tau; z_j(t_k)) - \hat q_j(\tau; z_j(t_k))\| \le \frac{\kappa}{2}\,\delta^2 (\tau - t_k)^2.
\]
As a result we have
\[
\|q(t_{k+1}) - q(t_k)\| \le \frac{\kappa\delta^4}{2}, \ \text{for each } k \in \mathbb{N} \quad \Longrightarrow \quad \|q(t_{N_0}) - q(t_0)\| \le N_0 \frac{\kappa\delta^4}{2}.
\]
On the other hand, from the position comparison constraint we have
\[
\|q(t_{N_0}) - q(t_0)\| \le N_0\,\delta^2 \kappa.
\]
A reasonable assumption is that $N_0\delta$ is a constant, meaning that the number of iterations is
proportional to the receding horizon sample rate. The equations above imply that the closed-loop
position trajectory will deviate from the position trajectory at initialization by a factor
of $\delta^3$ with the control comparison constraint, and by a factor of $\delta$ with the position comparison
constraint. The theory requires that $\delta$ be sufficiently small. Therefore, obeying the sufficient
conditions in the theory implies that the receding horizon controller will resemble the initial
open-loop solution more when using the control comparison constraint.

Suppose now that the linear, homogeneous vehicle dynamics are replaced by the more
generic nonlinear dynamics
\[
\dot z_i(t) = f_i(z_i(t), u_i(t)), \quad \text{for each } i = 1, \ldots, N_a.
\]
The dynamics above require additional assumptions, e.g., stabilizability around each respective
equilibrium point $z_i^c$, as specified in the formulation in [6]. Suppose that the cost is
still quadratic, with every distributed integrated cost $L_i$ depending upon each neighboring
state $z_j$ (or some components of the state), for $j \in \mathcal{N}_i$. Again, the proof of Lemma 3 will
require a bound on the norm of the difference between the assumed trajectories $\hat z_i$ and the
actual trajectories $z_i^*$, for every $i = 1, \ldots, N_a$. If the comparison constraint is in terms of
the state $z_i$, then the theoretical results follow immediately, since the constraint bounds can
be directly substituted into the bounding argument in the proof. If the control comparison
constraint is used, however, achieving a bound on the difference between assumed and actual
state trajectories becomes more cumbersome. Finally, we note that the analysis in [6] can
be used for construction of the distributed terminal cost and constraint functions.

From a numerical point of view, the control comparison constraint may be more appropriate.
Specifically, consider the case where, for a given horizon time $T$, the value of the
allowable product $\delta^2\kappa$ is not too different for the control and position comparison
constraint formulations. The control comparison constraint implies that

$$\| q_i(s; z_i(t_k)) - \hat q_i(s; z_i(t_k)) \| \le (s - t_k)^2 \delta^2 \kappa / 2, \quad s \in [t_k, t_k + T].$$

Thus, we have a parabolic constraint on the position, and for $s > t_k + \sqrt{2}$, this constraint
is less stringent than the position comparison constraint. If $T$ is large, then each distributed
optimization problem is generally easier with the control comparison constraint, in the sense
that the position trajectory is not constrained to stay as close to the previous trajectory as
time proceeds.
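
The parabolic bound can be recovered by integrating the control bound twice. The derivation below is a sketch under stated assumptions: double-integrator dynamics $\ddot q_i = u_i$, a control comparison constraint of the form $\|u_i - \hat u_i\| \le \delta^2\kappa$, and a position comparison constraint of the form $\|q_i - \hat q_i\| \le \delta^2\kappa$. These forms are inferred from the bound above rather than restated from the earlier sections, and the symbol $e$ is shorthand introduced only for this illustration.

```latex
\begin{align*}
e(s) &:= q_i(s; z_i(t_k)) - \hat q_i(s; z_i(t_k)),
  \qquad e(t_k) = 0, \quad \dot e(t_k) = 0, \\
\|\dot e(s)\| &\le \int_{t_k}^{s} \|u_i(\tau) - \hat u_i(\tau)\| \, d\tau
  \le (s - t_k)\, \delta^2 \kappa, \\
\|e(s)\| &\le \int_{t_k}^{s} (\tau - t_k)\, \delta^2 \kappa \, d\tau
  = \frac{(s - t_k)^2}{2}\, \delta^2 \kappa, \\
\frac{(s - t_k)^2}{2}\, \delta^2 \kappa > \delta^2 \kappa
  &\iff s - t_k > \sqrt{2}.
\end{align*}
```

The initial conditions of $e$ vanish because both trajectories start from the same state $z_i(t_k)$. The last line gives the crossover time beyond which the parabolic bound is looser than a constant bound of $\delta^2\kappa$.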

The observations above should be read with the understanding that the theoretical conditions are
conservative. In practice, values of $\delta$ much larger than those specified by the theory are
successful.  Moreover,  the  control  comparison  constraint  has  been  tested  in  simulations,  with 
success,  while  the  position  comparison  constraint  has  yet  to  be  tested.  As  part  of  our  future 
work,  we  shall  compare  the  performance  of  the  two  formulations  in  simulations. 

In the next section, formations of vehicles are stabilized using the centralized and
distributed receding horizon controllers defined in the previous sections. The simulations reveal
that for a fixed, small value of the update period $\delta$, convergence is obtained with good
accuracy. Moreover, the performance of the distributed implementation is comparable to that
of the centralized implementation.


5  Formation  Stabilization  Example 


A  simulation  of  a  four  vehicle  formation  is  presented  in  this  section.  The  dimension  of  the 
position  vector  for  all  agents  is  two  (n  =  2).  The  state  and  control  constraint  sets  are  defined 
as 

$$\mathcal{Z} = \mathbb{R}^4, \qquad \mathcal{U} = \{(u_1, u_2) \in \mathbb{R}^2 : -1 \le u_j \le 1, \; j = 1, 2\}.$$

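The objective is a fingertip formation that tracks the reference trajectory
$(q_{ref}(t), \dot q_{ref}(t)) \in \mathbb{R}^4$, defined as

$$(q_{ref}(t), \dot q_{ref}(t)) = \begin{cases} \ldots, & t \in [0, 10) \\ \ldots, & t \in [10, \infty) \end{cases}, \qquad (19)$$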


where $t_0 = 0$ in the notation of the previous sections. The acceleration $\ddot q_{ref}(t)$ is
zero for all time except at $t = 10$ seconds. To be consistent with the cooperative stabilization
objective, we rewrite the system dynamics in equation (1) in error form. In particular, the error
system for any agent $i$ has state $(q_i - q_{ref}, \dot q_i - \dot q_{ref})$ and dynamics
$\ddot q_i = u_i$. The jump in the reference velocity at time $t = 10$ serves to examine how well
the error dynamics are stabilized for two different legs of the reference trajectory. Equivalently,
the problem is to stabilize the error dynamics from an initial condition at time 0, and then again
from the current state at time 10.

To eliminate any offset between the center of geometry of the formation and the reference
trajectory, we set the formation path to $q_d(t) = (0, 0)$ for all $t \ge 0$. The vector formation
graph is defined by vertices $\mathcal{V} = \{1, 2, 3, 4\}$ and relative vectors
$\mathcal{S} = \{(1, 2), (1, 3), (2, 4)\}$. As in the generalization of Section 2, the core
vehicles associated with the tracking cost are $\{1, 2, 3\}$. The relative vectors are defined for
the two legs of the reference trajectory as

$$d_{12} = d_{24} = \begin{cases} (-2, 1), & t \in [0, 10) \\ (1, 2), & t \in [10, \infty) \end{cases}, \qquad d_{13} = \begin{cases} (-2, -1), & t \in [0, 10) \\ (-1, 2), & t \in [10, \infty). \end{cases}$$

The common rotation of the vectors at time $t = 10$ is to match the heading of the fingertip
formation with the heading of the reference trajectory. The initial conditions for each agent
are given as $q_1(0) = (-1, 2)$, $q_2(0) = (-4, 0)$, $q_3(0) = (-2, 0)$ and $q_4(0) = (-7, -1)$,
with $\dot q_i(0) = (0, 0)$ for each agent $i \in \mathcal{V}$. In both centralized and distributed
receding horizon implementations, a horizon time of $T = 5.0$ is used. Also, the following
weighting parameter values are used in both implementations: $\omega = 2.0$, $\nu = 1.0$ and
$\mu = 2.0$. As stated, collision avoidance is not incorporated in the optimal control problems,
either by cost or constraint. In all simulation results presented, no collisions were observed.
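
The common rotation of the relative vectors can be checked numerically. The sketch below uses the two piecewise-defined relative vectors given above (labelled `d12` and `d13` here) and recovers the angle that maps each first-leg vector onto its second-leg value. The helper names are hypothetical and introduced only for this illustration.

```python
import math

def rotate(v, angle):
    """Rotate the 2-D vector v counterclockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def rotation_between(before, after):
    """Angle in [-pi, pi) that rotates `before` onto the direction of `after`."""
    raw = math.atan2(after[1], after[0]) - math.atan2(before[1], before[0])
    return (raw + math.pi) % (2.0 * math.pi) - math.pi

# Relative vectors on the two legs, copied from the definitions in the text.
LEG_1 = {"d12": (-2.0, 1.0), "d13": (-2.0, -1.0)}  # t in [0, 10)
LEG_2 = {"d12": (1.0, 2.0), "d13": (-1.0, 2.0)}    # t in [10, inf)

angles = {name: rotation_between(LEG_1[name], LEG_2[name]) for name in LEG_1}
```

Both entries of `angles` come out as a quarter turn in the negative (clockwise) direction, which is what "common rotation" means here.
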

To solve the optimal control problems numerically, we employ the Nonlinear Trajectory
Generation (NTG) software developed at Caltech. A detailed description of NTG as a real-time
trajectory generation package for constrained mechanical systems is given in [21]. The
package is based on finding trajectory curves in a lower dimensional space and parameterizing
these curves by B-splines. Sequential quadratic programming (SQP) is used to solve for the
B-spline coefficients that optimize the performance objective, while respecting dynamics and
constraints. The package NPSOL [12] is used to solve the SQP problem.
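
To make the B-spline parameterization concrete, the following sketch evaluates a scalar trajectory from a small set of coefficients using the standard Cox-de Boor recursion. It does not reproduce the NTG interface. The function names, the cubic degree, the uniform clamped knot vector and the choice of seven coefficients are all assumptions made for illustration.

```python
def bspline_basis(i, degree, t, knots):
    """Cox-de Boor recursion for the i-th B-spline basis function."""
    if degree == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    value = 0.0
    left_span = knots[i + degree] - knots[i]
    if left_span > 0.0:
        value += ((t - knots[i]) / left_span
                  * bspline_basis(i, degree - 1, t, knots))
    right_span = knots[i + degree + 1] - knots[i + 1]
    if right_span > 0.0:
        value += ((knots[i + degree + 1] - t) / right_span
                  * bspline_basis(i + 1, degree - 1, t, knots))
    return value

def clamped_knots(n_coeffs, degree, horizon):
    """Uniform clamped knot vector on [0, horizon] for n_coeffs coefficients."""
    n_interior = n_coeffs - degree - 1
    interior = [horizon * (j + 1) / (n_interior + 1) for j in range(n_interior)]
    return [0.0] * (degree + 1) + interior + [horizon] * (degree + 1)

def evaluate(coeffs, degree, horizon, t):
    """Evaluate sum_i coeffs[i] * B_i(t) for t in [0, horizon]."""
    if t >= horizon:
        return coeffs[-1]  # a clamped spline ends at its last coefficient
    knots = clamped_knots(len(coeffs), degree, horizon)
    return sum(c * bspline_basis(i, degree, t, knots)
               for i, c in enumerate(coeffs))

# Hypothetical coefficients for one control component over a horizon T = 5.0.
COEFFS = [0.0, 0.4, 0.9, 0.5, 0.1, -0.2, 0.0]
samples = [evaluate(COEFFS, 3, 5.0, 0.5 * k) for k in range(11)]
```

In a scheme of this kind the optimizer searches over the coefficients, and the whole trajectory is recovered from them by evaluation, so only the coefficients need to be stored or transmitted.
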

For the centralized receding horizon control law, parameter values in the optimal control
problem must be chosen to guarantee that Assumption 2 is true. For the weights chosen
above, $K$ is defined as the linear quadratic regulator and $P$ the corresponding stable solution
to the algebraic Riccati equation. Choosing $\alpha = 0.4$ implies that assumption (i) is true.
To prove this, define $y = z - z^c$ and observe that

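$$\begin{aligned} y^T P y \le \alpha \;&\Rightarrow\; \lambda_{\max}(P)\, y^T P y \le \lambda_{\max}(P)\,\alpha \;\Rightarrow\; y^T P^2 y \le \lambda_{\max}(P)\,\alpha \\ &\Rightarrow\; y^T P B B^T P y \le \lambda_{\max}(P)\,\alpha \;\Leftrightarrow\; y^T K^T K y \le \frac{\lambda_{\max}(P)\,\alpha}{\mu^2}. \end{aligned}$$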

Choosing $\alpha \le \mu^2 / \lambda_{\max}(P) \approx 0.4$ guarantees that $\|Ky\|^2 \le 1$.
Finally, the latter condition guarantees that $K(z - z^c) \in \mathcal{U}^{N_a}$ for all
$z \in \Omega(\alpha)$, since each component of $Ky$ will be between -1 and 1 for all time. For an
update period of $\delta = 0.5$, centralized receding horizon control of the fingertip formation
is shown in Figure 2. The four closed-loop position trajectories of the vehicles are shown in the
figure, with each vehicle depicted by a triangle. The heading of any triangle shows the direction
of the corresponding velocity vector. The symbols along each trajectory mark the points at which
the receding horizon updates occur. The legend identifies a symbol with a vehicle number for each
trajectory. The triangles show the position and heading of each vehicle at snapshots of time,
specifically at 0.0, 6.0, 12.0 and 18.0 seconds.

Figure 2: Four vehicle fingertip formation using centralized receding horizon control.

Also shown at these instants of time are the reference trajectory position $q_{ref}(t)$, identified
by the black square, and the average position of the core vehicles $q_\Sigma(t)$, identified by the
yellow square. The tracking part of the cooperative objective is achieved when
$q_\Sigma(t) = q_{ref}(t)$, i.e., when the two squares are perfectly overlapping. At time 6.0, the
vehicles are close to the desired formation, and the squares are nearly overlapped, indicating
that the tracking objective is being reached. After 8.0 seconds, the formation objective has been
met to a numerical precision of 0.01, which is the value of the optimal cost function at that time.
At time 12.0, the snapshot shows the formation reconfiguring to the change in heading of
the reference trajectory which occurred at time 10.0. At time 18.0, the objective has again
been met and the optimal cost function has a numerical value of less than 0.01.

The receding horizon control law time history for agent 3 is shown in Figure 3. At receding
horizon updates, the control is not required to initially match the last control value applied.
Consequently, the resulting closed-loop control will be discontinuous in general. The figure
shows greater discontinuity during the transient phase of the closed-loop response, with the
largest discontinuity occurring at time 0.0 and at time 10.0, when the reference trajectory
changed heading.

Figure 3: Centralized receding horizon control law time history for agent 3.

For the distributed receding horizon implementation, the initial state at time 0.0 is used
for initialization, as described in Definition 7. In terms of the notation, we thus have
$t_{-1} = 0.0$. Regarding the conditions in Assumption 3, we first choose
$Q_i = \lambda_{\max}(Q) I_{(4)}$, where $\lambda_{\max}(Q) \approx 6.85$. As in the centralized
case, $K_i$ is defined as the linear quadratic regulator and $P_i$ the corresponding stable
solution to the algebraic Riccati equation. Following the steps above, we can show that
$\alpha_i = 0.33$ guarantees that the conditions in the assumption will hold. Finally, we set
$\gamma = 2$ in the cost functions of the distributed optimal control problems. The distributed
receding horizon controller is applied for all time and switching to the decoupled feedbacks is
not employed.
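
The value 0.33 can be sanity-checked with a short calculation. The sketch below assumes each position axis is a double integrator with state weight $6.85\,I$ and control weight $R = \mu I$ with $\mu = 2.0$, and applies the bound $\alpha \le \mu^2 / \lambda_{\max}(P)$ from the centralized argument. The closed-form Riccati solution and the function names are introduced for illustration, and whether this matches the exact computation used for the paper is an assumption.

```python
import math

def double_integrator_are(q_pos, q_vel, r):
    """Stabilizing solution P = [[p11, p12], [p12, p22]] of
    A'P + PA - P B B' P / r + Q = 0 with A = [[0, 1], [0, 0]],
    B = [0, 1]' and Q = diag(q_pos, q_vel)."""
    p12 = math.sqrt(q_pos * r)
    p22 = math.sqrt(r * (q_vel + 2.0 * p12))
    p11 = p12 * p22 / r
    return p11, p12, p22

def lambda_max(p11, p12, p22):
    """Largest eigenvalue of the symmetric matrix [[p11, p12], [p12, p22]]."""
    mean = 0.5 * (p11 + p22)
    return mean + math.hypot(0.5 * (p11 - p22), p12)

Q_WEIGHT = 6.85  # lambda_max(Q) quoted in the text
MU = 2.0         # control weight quoted in the text; R = MU * I is assumed

p11, p12, p22 = double_integrator_are(Q_WEIGHT, Q_WEIGHT, MU)
alpha_bound = MU ** 2 / lambda_max(p11, p12, p22)
print(round(alpha_bound, 3))
```

Because the two position axes are decoupled and weighted identically, the four-state problem has the same maximum eigenvalue as this two-state one, so a single axis suffices for the check.
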

Before employing the distributed receding horizon control law defined in Section 4.1, we
first explore what happens when neighbors are assumed to have zero control and the control
comparison constraint is not enforced ($\kappa = +\infty$). This was explored in simulations in a
previous paper [10]. In other words, neighbors are assumed to continue along straight line
paths over any optimization horizon. For an update period of $\delta = 0.5$, the result is shown
in Figure 4. The receding horizon control law time history for agent 3 is shown in Figure 5.
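
The zero-control assumption amounts to extrapolating each neighbor's position along its current velocity. A minimal sketch follows, with hypothetical function names and assuming the neighbor's state is a 2-D position and a 2-D velocity:

```python
def straight_line_prediction(q, q_dot, t_k, s):
    """Neighbor position predicted at time s >= t_k under zero control."""
    dt = s - t_k
    return tuple(qi + dt * vi for qi, vi in zip(q, q_dot))

def message_for_neighbors(q, q_dot):
    """State broadcast at an update: position followed by velocity."""
    return (*q, *q_dot)

# Example: a neighbor at (1, 2) moving with velocity (0.5, -1), seen 2 s ahead.
predicted = straight_line_prediction((1.0, 2.0), (0.5, -1.0), t_k=0.0, s=2.0)
```

The prediction error grows with the look-ahead time `s - t_k` whenever the neighbor actually applies a nonzero control.
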

Figure 4: Four vehicle formation using distributed receding horizon control, assuming neighbors
continue along straight line paths at each update and without enforcing the control
comparison constraints ($\kappa = +\infty$).

The response is characterized by overshoot, as agents believe neighbors will continue along
vectors tangent to the path over the entire optimization horizon at each update. The figure
shows the position and heading triangles at the same snapshots of time as before, plus one
additional snapshot at time 24.0. As time grows, the formation is observed to get closer to
meeting the formation objective, but only after a long time. Moreover, the formation never
meets the objective to acceptable precision.

Figure 5: Distributed receding horizon control law time history of agent 3. Left figure shows
time history when $\delta = 0.5$, and right figure shows time history when $\delta = 0.1$.

Interestingly, for the given weights in the cost function, the closed-loop performance is
observed to stay the same even if the update period is decreased. In particular, if $\delta$ is
reduced to 0.25 or 0.1, the response is nearly the same. For example, the receding horizon control
law time history for agent 3 with $\delta = 0.1$ is also shown in Figure 5. As in the previous
paper [10], other parameters were observed to improve the performance in this case. Specifically,
increasing the damping weighting $\nu$ reduces the overshoot effect, although the response of
the formation naturally becomes more sluggish. Also, if the horizon time $T$ is shortened,
overall performance improves, as the assumption becomes more valid. The reason is that
a straight line approximation is generally a valid approximation locally, and shrinking $T$
means the assumption should hold over a more local domain, relative to larger values of $T$.

In  the  formulation  in  [16],  where  each  agent  optimizes  for  itself  as  well  as  for  neighboring 
agents,  a  similar  effect  is  observed.  There,  agents  assume  that  neighbors  will  react  solely 
with  regard  to  the  local  cost  function  and  constraints.  Apparently,  such  a  self-interested 
philosophy  is  not  too  bad  if  agents  are  not  looking  too  far  into  the  future,  since  initial 
conditions  are  consistent  in  all  distributed  optimization  problems  at  each  update. 

Now, we consider the performance of the distributed receding horizon control law defined
in Section 4.1. After initialization, the control comparison constraint is enforced, setting
$\kappa = 2$. Although the norm in each control comparison constraint is defined to be the
Euclidean 2-norm, we implement the $\infty$-norm, since it is a linear constraint and therefore
easier for the optimization algorithm. For an update period of $\delta = 0.5$, distributed receding
horizon control of the fingertip formation is shown in Figure 6. As before, the triangles show
the position and heading of each vehicle at the time snapshots of 0.0, 6.0, 12.0 and 18.0
seconds. The performance is close to that of the centralized implementation. At snapshot
time 6.0, the formation is slightly lagging the reference, compared to the centralized version.
Also, vehicles 1 and 3 in particular slightly overshoot, in comparison to their centralized
counterparts, when the reference heading changes at time 10.0. At time 18.0, the formation
objective is close to being met, and after slightly more time the same precision as the centralized
implementation is achieved.

Figure 6: Four vehicle formation using distributed receding horizon control.

The distributed receding horizon control law time history for agent 3 is shown in Figure 7.
In the control comparison constraint, we have that $\delta^2\kappa = 0.5$. As a byproduct of this
constraint, the allowable size of discontinuity in the closed-loop control at receding horizon
update times is reduced. In particular, for the centralized implementation, the control can jump
by as much as 2, e.g., from -1 to 1. However, in the distributed implementation with the control
comparison constraint, since we used the $\infty$-norm, each component of the control can jump by
0.5 at most. The control in Figure 7 shows a few places where such maximal discontinuities occur,
specifically just after time 10.0 when the reference trajectory changes heading.
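
The componentwise form of the comparison constraint can be sketched as a simple check on sampled controls. The code assumes the constraint has the form $\|u_i - \hat u_i\|_\infty \le \delta^2\kappa$ at each sample time, which is consistent with the bound of 0.5 quoted above. The function names and the sampled representation are hypothetical.

```python
def max_control_deviation(u, u_hat):
    """Largest componentwise gap between control samples u and assumed
    control samples u_hat, both lists of (u1, u2) on a common time grid."""
    return max(abs(a - b)
               for u_k, u_hat_k in zip(u, u_hat)
               for a, b in zip(u_k, u_hat_k))

def satisfies_comparison_constraint(u, u_hat, delta, kappa):
    """True if the infinity-norm deviation is at most delta**2 * kappa."""
    return max_control_deviation(u, u_hat) <= delta ** 2 * kappa

DELTA, KAPPA = 0.5, 2.0  # values quoted in the text, so delta^2 * kappa = 0.5

u_assumed = [(0.0, 0.0), (0.2, -0.1)]
u_ok = [(0.5, 0.0), (0.2, 0.3)]     # largest gap is 0.5, on the bound
u_bad = [(0.6, 0.0), (0.2, -0.1)]   # largest gap is 0.6, outside the bound
```

Each component is bounded separately, so the check is linear in the decision variables. This is the property that makes the $\infty$-norm convenient for the optimizer.
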

Since agents are relying on the assumption that neighbors keep doing what they were
doing, and the control comparison constraint ensures that the assumption is not too far off,
stability is ensured. In fact, even if the comparison constraint is removed, stability is still
observed in the simulation for the chosen parameter values above. The sensitivity to horizon time,
as observed above when neighbors are assumed to continue along straight-line paths, and as
observed in the formulation in [16], is no longer present.

Figure 7: Distributed receding horizon control law time history for agent 3.

Regarding the communication requirements of transmitting assumed controls to neighboring
agents, in the NTG formulation corresponding to the simulations above, 14 B-spline
coefficients specified the two-dimensional assumed control trajectory of each agent. In
comparison, when agents assume neighbors continue along straight lines, 4 numbers must be
communicated at each update, representing the initial condition of the state at the update
time. Thus, such representations of trajectories in the optimization problem can aid in
keeping the communication requirements closer to those of other decentralized schemes [8].

6  Conclusions  and  Extensions 

We  have  shown  under  what  assumptions  a  centralized  optimal  control  problem,  whose  cost 
couples  the  states  of  a  set  of  dynamically  decoupled  subsystems,  could  be  decomposed  into  a 
set  of  distributed  optimal  control  problems  for  a  distributed  receding  horizon  implementation. 
The  implementation  requires  an  additional  constraint  in  the  local  optimal  control  problems, 
namely  a  constraint  ensuring  that  assumed  and  applied  control  trajectories  not  deviate  too 
far  from  one  another.  Asymptotic  stability  is  proven  in  the  absence  of  uncertainty  and 
for  sufficiently  fast  receding  horizon  updates.  Although  the  details  of  the  theory  here  are 
specific  to  homogeneous  and  linear  dynamics,  the  approach  is  general  and  the  nonlinear, 
inhomogeneous  case  is  worked  out  elsewhere  [9]. 

As discussed in Section 4.3, the theory is quite conservative in that, when the update
period is small enough to satisfy the theoretical conditions for asymptotic stability, the
control comparison constraint implies the closed-loop trajectories must remain close to the
trajectories computed at initialization. This implication is not a desirable property, as
it takes away from the power of the receding horizon philosophy, namely, the ability to
recompute an optimal action based on current conditions. Still, in practice, larger values for
the update period than required by the theory achieve performance that is comparable to the
centralized implementation. Less conservative results will be one objective in our ongoing
research.

The inter-agent communication requirements of our approach are more intensive than
those of decentralized feedback control [8]. The tradeoff is that the power of an optimization-
based  approach  is  available,  namely,  the  ability  to  address  performance  in  the  presence 
of  generic  constraints  and  nonlinearities.  Moreover,  no  particular  structure  in  the  overall 
system  dynamics  is  required;  all  agents  could  have  different  dynamics.  We  also  note  that 
the  dimension  of  each  distributed  optimal  control  problem  is  equal  to  that  of  an  optimal 
control  problem  of  the  single  corresponding  agent.  Thus,  there  is  considerable  improvement 
in  tractability  over  the  centralized  problem,  particularly  when  the  number  of  agents  Na 
is large. If the trajectories are known to be sufficiently smooth, and polynomial-based
approximations are valid, the communication requirements need not be substantially worse
than those of standard decentralized schemes.

We  should  also  emphasize  that  the  multi-vehicle  formation  stabilization  problem  is  simply 
a  venue.  In  other  problems  where  the  performance  objective,  specifically  the  integrated  cost, 
is  decomposable  in  such  a  way  that  the  summation  recovers  the  centralized  cost,  the  approach 
is applicable. Also, if constraints that couple the states and/or controls of neighboring agents
are present, the distributed implementation still preserves the property that if there exists a
feasible solution at initialization, there exists a feasible solution to all subsequent distributed
optimization problems. The initialization phase of the distributed implementation, however,
now  requires  a  better  guess  for  the  assumed  controls,  namely,  one  that  is  known  to  be  feasible. 
Still,  this  requirement  is  no  stronger  than  that  imposed  on  the  centralized  implementation;  if 
the  centralized  problem  has  an  initial  feasible  solution,  subsequent  feasibility  can  be  ensured. 
If  an  initially  feasible  solution  to  the  centralized  problem  is  available,  it  could  be  used  to 
define  the  assumed  controls  at  initialization  in  the  distributed  case.  For  the  asymptotic 
stability  results  to  hold,  the  coupling  constraints  must  satisfy  the  stated  conditions  on  the 
set  Z.  It  may  be  of  interest  to  see  if  coupling  in  the  dynamics,  as  with  the  problem  addressed 
in  [14],  could  also  be  addressed  by  the  formulation  here. 
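
The decomposability requirement can be illustrated with a toy coupling cost. In the sketch below, each relative-vector term is split equally between its two agents so that the per-agent costs sum to the centralized cost. The equal split, the sign convention inside the norm, the relative-vector values and the function names are assumptions made for illustration and are not necessarily the decomposition used in the earlier sections.

```python
def sq_norm(v):
    """Squared Euclidean norm of a vector given as a sequence of floats."""
    return sum(x * x for x in v)

def edge_cost(q, d, i, j, weight):
    """Coupling term weight * ||q_i - q_j + d_ij||^2 for the pair (i, j)."""
    diff = [a - b + c for a, b, c in zip(q[i], q[j], d[(i, j)])]
    return weight * sq_norm(diff)

def centralized_cost(q, d, weight):
    """Sum of the coupling terms over all relative vectors."""
    return sum(edge_cost(q, d, i, j, weight) for (i, j) in d)

def distributed_cost(agent, q, d, weight):
    """One agent's share: half of every coupling term it takes part in."""
    return sum(0.5 * edge_cost(q, d, i, j, weight)
               for (i, j) in d if agent in (i, j))

# Initial positions from the example; relative vectors are illustrative values.
q0 = {1: (-1.0, 2.0), 2: (-4.0, 0.0), 3: (-2.0, 0.0), 4: (-7.0, -1.0)}
d0 = {(1, 2): (-2.0, 1.0), (1, 3): (-2.0, -1.0), (2, 4): (-2.0, 1.0)}

total = centralized_cost(q0, d0, weight=2.0)
shares = [distributed_cost(a, q0, d0, weight=2.0) for a in q0]
```

Each agent's share depends only on its own state and the states of the agents it is coupled to, which is what allows the local problems to be posed with neighbor information alone.
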

The theory will ultimately be applied to the Caltech Multi-Vehicle Wireless Testbed [7].
Other venues for application of the theory may exist, for example, in dynamic formulations
of resource allocation problems in networks, or in dynamic game theoretic settings. For
instance, the approach by Baglietto et al [2] for distributed dynamic routing in a network
could be compared to a discrete-time version of our distributed receding horizon control law.